When writing Python modules, is there a way to prevent a module from being imported twice by client code? Something like what C/C++ header files do:
#ifndef XXX
#define XXX
...
#endif
Thanks very much!
Python modules aren't imported multiple times. Just running import a second time will not reload the module. If you want it to be reloaded, you have to call the reload function (a builtin in Python 2; importlib.reload in Python 3). Here's a demo.
foo.py is a module with the single line
print("Hello, I am being imported")
And here is a screen transcript of multiple import attempts.
>>> import foo
Hello, I am being imported
>>> import foo # Will not print the statement
>>> reload(foo) # Will print it again
Hello, I am being imported
Imports are cached, and only run once. Additional imports only cost the lookup time in sys.modules.
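The caching described above can be checked with a small, self-contained sketch. The module name import_once_demo and its runs counter are made up for this demo; the module is written to a temporary directory so the example does not depend on any existing file.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module: its body increments a counter kept in its own namespace.
# reload() re-executes the body in the same namespace, so the counter survives.
SOURCE = 'runs = globals().get("runs", 0) + 1\n'

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "import_once_demo.py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmp)
    try:
        import import_once_demo as first    # module body runs here
        import import_once_demo as second   # served from sys.modules, body not run
        runs_after_imports = first.runs
        importlib.reload(first)             # body runs again
        runs_after_reload = first.runs
    finally:
        sys.path.remove(tmp)

print(first is second, runs_after_imports, runs_after_reload)  # True 1 2
```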
As specified in other answers, Python generally doesn't reload a module when encountering a second import statement for it. Instead, it returns its cached version from sys.modules without executing any of its code.
However, there are several pitfalls worth noting:
Importing the main module as an ordinary module effectively creates two instances of the same module under different names.
This occurs because during program startup the main module is set up with the name __main__. Thus, when importing it as an ordinary module, Python doesn't detect it in sys.modules and imports it again, but with its proper name the second time around.
Consider the file /tmp/a.py with the following content:
# /tmp/a.py
import sys
print("%s executing as %s, recognized as %s in sys.modules" % (__file__, __name__, sys.modules[__name__]))
import b
Another file /tmp/b.py has a single import statement for a.py (import a).
Executing /tmp/a.py results in the following output:
root#machine:/tmp$ python a.py
a.py executing as __main__, recognized as <module '__main__' from 'a.py'> in sys.modules
/tmp/a.pyc executing as a, recognized as <module 'a' from '/tmp/a.pyc'> in sys.modules
Therefore, it is best to keep the main module fairly minimal and export most of its functionality to an external module, as advised here.
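The double execution can be reproduced without touching /tmp by hand. This is a minimal sketch under the assumption that a script named selfimport_demo.py (a made-up name) imports itself; it is run in a child interpreter so that it really is the main module.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical script that imports itself under its ordinary module name.
SOURCE = (
    'print("executing as " + __name__)\n'
    "import selfimport_demo\n"
)

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "selfimport_demo.py")
    with open(script, "w") as fh:
        fh.write(SOURCE)
    # The script's directory becomes sys.path[0] in the child process,
    # so "import selfimport_demo" finds the same file again.
    result = subprocess.run(
        [sys.executable, script], capture_output=True, text=True, check=True
    )

lines = result.stdout.splitlines()
print(lines)  # ['executing as __main__', 'executing as selfimport_demo']
```

The body runs once as __main__ and once as selfimport_demo; the nested import inside the second run is served from sys.modules, which is why there is no third line.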
This answer specifies two more possible scenarios:
Slightly different import statements utilizing different entries in sys.path leading to the same module.
Attempting another import of a module after a previous one failed halfway through.
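The second scenario can be sketched as follows. The module name flaky_demo and the FLAKY_DEMO_RUNS environment variable are invented for the demo; the variable only serves as a counter that survives the failed import.

```python
import os
import sys
import tempfile

# Hypothetical module that fails the first time its body runs.
SOURCE = (
    "import os\n"
    'runs = int(os.environ.get("FLAKY_DEMO_RUNS", "0")) + 1\n'
    'os.environ["FLAKY_DEMO_RUNS"] = str(runs)\n'
    "if runs == 1:\n"
    '    raise RuntimeError("simulated failure halfway through import")\n'
)

os.environ.pop("FLAKY_DEMO_RUNS", None)

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "flaky_demo.py"), "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmp)
    try:
        try:
            import flaky_demo
            first_import_failed = False
        except RuntimeError:
            first_import_failed = True
        # A module whose import raised is not left behind in sys.modules ...
        cached_after_failure = "flaky_demo" in sys.modules
        # ... so the next import statement executes the module body again.
        import flaky_demo
        total_runs = flaky_demo.runs
    finally:
        sys.path.remove(tmp)

print(first_import_failed, cached_after_failure, total_runs)  # True False 2
```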
I'm new to python and doing an assignment. It's meant to be done with linux but as I'm doing it by myself on my own computer I'm doing it on windows.
I've been trying to reproduce the test system that we use, which looks like this:
>>> import file
>>> file.function(x)
"Answer that we want"
Then we run it through the linux terminal. I've been trying to create my own way of doing this by making a test file which imports the file and runs the function. But instead of just running the function, it runs the whole script, even though nothing ever asked it to do that.
import file
file.function(x)
That's pretty much what I've been doing but it runs the whole "file". I've also tried from file import function; it does the same.
What kind of script can I use to check the "answer that I want" from the test file? When we run it through the linux terminal, it says whether it has failed or scored.
Importing a file is equivalent to running it.
When you import a file (module), a new module object is created, and upon executing the module, every new identifier is put into the object as an attribute.
So if you don't want the module to do anything upon importing, rewrite it so it only has assignments and function definitions.
If you want it to run something only when invoked directly, you can do
A = 'whatever'  # placeholder value
def b():
    ...
if __name__ == '__main__':
    # write code to be executed only on direct execution, but not on import
    # This is because direct execution assigns `'__main__'` to `__name__` while any kind of import assigns the name under which the module is imported.
    pass
This holds no matter if you do import module or from module import function, as these do the same. Only the final assignment is different:
import module does:
Check sys.modules, and if the module name isn't contained there, import it.
Assign the identifier module to the module object.
from module import function does:
Check sys.modules, and if the module name isn't contained there, import it. (Same step as above).
Assign the identifier function to the module object's attribute function.
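The two sets of steps above can be observed with any module. This sketch uses the standard library module colorsys purely as an example target.

```python
import sys

# from-import: the whole module is imported and cached,
# but only the requested attribute is bound in this namespace.
from colorsys import rgb_to_hsv

module_cached = "colorsys" in sys.modules
module_name_bound = "colorsys" in globals()

# Plain import: the module is already in sys.modules, so nothing is re-executed;
# the statement just binds the name colorsys to the cached module object.
import colorsys

same_function = colorsys.rgb_to_hsv is rgb_to_hsv
same_module = colorsys is sys.modules["colorsys"]

print(module_cached, module_name_bound, same_function, same_module)
# True False True True
```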
You can check if the module is imported or executed with the __name__ attribute. If the script is executed the attribute is '__main__'.
It is also good style to define a main function that contains the code that should be executed.
def main():
    # do something
    pass
if __name__ == '__main__':
    main()
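To see the guard in action, the following sketch writes such a module to a temporary file (guarded_demo.py is a made-up name), imports it once, and then executes it as a script via runpy.run_path with run_name="__main__". Output is captured so the two cases can be compared.

```python
import contextlib
import importlib
import io
import os
import runpy
import sys
import tempfile

SOURCE = '''\
def main():
    print("main() ran")

if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "guarded_demo.py")
    with open(path, "w") as fh:
        fh.write(SOURCE)
    sys.path.insert(0, tmp)
    try:
        # Imported: __name__ is "guarded_demo", so main() is not called.
        imported_out = io.StringIO()
        with contextlib.redirect_stdout(imported_out):
            importlib.import_module("guarded_demo")
        # Executed as a script: __name__ is "__main__", so main() is called.
        executed_out = io.StringIO()
        with contextlib.redirect_stdout(executed_out):
            runpy.run_path(path, run_name="__main__")
    finally:
        sys.path.remove(tmp)

print(repr(imported_out.getvalue()))  # ''
print(repr(executed_out.getvalue()))  # 'main() ran\n'
```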
Consider the following:
a.py
foo = 1
b.py
bar = 2
c.py
import a
kik = 3
d.py
import a
import c
def main():
    import b
main()
main()
How many times is a.py loaded?
How many times is b.py loaded?
More generally, I would like to know how Python handles imported files and functions/variables.
Both a and b are loaded once. When you import a module, the resulting module object is cached in sys.modules, so a later import of the same module does not execute the original script again. Locating a module that is not yet cached is done by a "finder":
https://www.python.org/dev/peps/pep-0451/#finder
https://docs.python.org/3/library/importlib.html#importlib.abc.MetaPathFinder
This works across modules, so if d.py also imported b at module level, it would get the same cached module object as an import within c.py.
The import system documentation helps explain what happens during an import:
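The claim can be verified by recreating the four files from the question in a temporary directory. The loadlog helper module and its loaded list are additions made only for this sketch, so that each module body can record when it runs.

```python
import importlib
import os
import sys
import tempfile

# loadlog is a hypothetical helper that records every module body execution.
LOG_LINES = "import loadlog\nloadlog.loaded.append(__name__)\n"
FILES = {
    "loadlog.py": "loaded = []\n",
    "a.py": LOG_LINES + "foo = 1\n",
    "b.py": LOG_LINES + "bar = 2\n",
    "c.py": LOG_LINES + "import a\nkik = 3\n",
    "d.py": "import a\nimport c\ndef main():\n    import b\nmain()\nmain()\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, source in FILES.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(source)
    sys.path.insert(0, tmp)
    try:
        importlib.import_module("d")
        loaded = list(sys.modules["loadlog"].loaded)
    finally:
        sys.path.remove(tmp)

print(loaded)  # ['a', 'c', 'b']
```

Each of a, b and c appears exactly once, even though a is imported from two files and b is imported by two calls to main().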
https://docs.python.org/3/reference/import.html#importsystem
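When a module is first imported, Python searches for the module and if found, it creates a module object, initializing it.
The key words here are "first imported": all later imports find the module already cached in sys.modules and do not run it again. The finders consulted by __import__ are listed in sys.meta_path, and the path-based finder keeps its own cache in sys.path_importer_cache.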
https://docs.python.org/3/library/functions.html#import
You can also ask the import system to invalidate the finder caches, for example:
https://docs.python.org/3/library/importlib.html#importlib.import_module
If you are dynamically importing a module that was created since the interpreter began execution (e.g., created a Python source file), you may need to call invalidate_caches() in order for the new module to be noticed by the import system.
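A minimal sketch of that documented pattern: the module file late_module_demo.py (a made-up name) is created while the interpreter is already running, the finder caches are invalidated, and only then is the module imported.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        # The source file is created after the interpreter started.
        with open(os.path.join(tmp, "late_module_demo.py"), "w") as fh:
            fh.write("VALUE = 42\n")
        # Make sure the finders re-scan their directories before importing.
        importlib.invalidate_caches()
        late = importlib.import_module("late_module_demo")
        value = late.VALUE
    finally:
        sys.path.remove(tmp)

print(value)  # 42
```

Whether the import would fail without invalidate_caches() depends on timing and on the file system, so the sketch only shows the safe ordering, not the failure.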
The importlib module (Python 3.4+; the older imp module is deprecated and was removed in Python 3.12) allows a module to be recompiled and re-executed after import:
import importlib
import a
importlib.reload(a)
https://docs.python.org/3/library/importlib.html#importlib.reload
Python module’s code is recompiled and the module-level code re-executed, defining a new set of objects which are bound to names in the module’s dictionary by reusing the loader which originally loaded the module.
https://docs.python.org/3/library/imp.html
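The effect of a reload on an edited source file can be sketched like this. The module name reload_demo is invented, and bytecode writing is switched off for the demo so that the edited source is always recompiled rather than matched against a cached .pyc.

```python
import importlib
import os
import sys
import tempfile

# Demo-only setting: avoid .pyc files so the rewritten source is always used.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reload_demo.py")
    with open(path, "w") as fh:
        fh.write("VALUE = 1\n")
    sys.path.insert(0, tmp)
    try:
        import reload_demo
        before = reload_demo.VALUE
        # Edit the source on disk, then ask for a reload.
        with open(path, "w") as fh:
            fh.write("VALUE = 2  # edited after the first import\n")
        reloaded = importlib.reload(reload_demo)
        after = reload_demo.VALUE
    finally:
        sys.path.remove(tmp)

print(before, after, reloaded is reload_demo)  # 1 2 True
```

The module object stays the same; only the names in its dictionary are rebound to the newly created objects.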
To illustrate the issue I am having, please consider the following. I have two .py files, one named main.py and the other named mymodule.py. They are both in the same directory.
The contents of main.py:
from mymodule import myfunction
myfunction()
The contents of mymodule.py:
def myfunction():
    for number in range(0,10):
        print(number)
print("Hi")
I was under the impression that importing a function would only import that function. However, when I run main.py, this is what I get:
Hi
0
1
2
3
4
5
6
7
8
9
Why is print("Hi") being called? It isn't part of the function I imported!
I was under the impression that importing a function would only import that function.
It seems there's an incorrect assumption about what a from-import actually does.
The first time a module is imported, an import statement will execute the entire module, including print calls made at the global scope (docs). This is true regardless of whether mymodule was first imported by using a statement like import mymodule or by using a statement like from mymodule import myfunction.
Subsequent imports of the same module will re-use an existing module cached in sys.modules, which may be how you arrived at the misunderstanding that the entire module is not executed.
There is a common pattern to avoid global level code being executed by a module import. Often you will find code which is not intended to be executed at import time located inside a conditional, like this:
def myfunction():
    for number in range(0,10):
        print(number)
if __name__ == "__main__":
    print("Hi")
In order to import something from the module Python needs to load this module first. At that moment all the code at module-level is executed.
According to the docs:
A module can contain executable statements as well as function
definitions. These statements are intended to initialize the module.
They are executed only the first time the module name is encountered
in an import statement.
This question seems to be a duplicate of this one.
In short: all the code of a Python file is executed when importing the module. Whatever is neither a function nor a class is usually put in a main function that is called here:
if __name__ == "__main__":
    # stuff only to run when not called via 'import' here
    main()
Please consider closing this thread.
Let's say I have a really long script (1000+ lines long, in my case), so I split it into separate files:
main.py #the file i execute
foo1.py #a file my main.py imports
foo2.py #a file imported by foo1.py
(note: main.py imports several files, not just the one)
Foo1.py holds Tkinter, and things related to it, while Foo2.py holds a huge object class with functions related to said class.
My problem is as follows:
Foo1 imports Foo2
Foo2 runs a function that calls another function from Foo1
Foo2 raises a 'global name ' is not defined' error
And I also can't import the function into Foo2, because Foo1 already has it and that raises an import error.
When two modules import each other there are a few things you need to keep in mind so everything is defined before it is needed.
First let's consider the mechanics of importing:
when a module is imported for the first time an entry is added to sys.modules and the defining file starts executing (pausing the execution of the import-er)
subsequent imports will simply use the entry in sys.modules - whether or not the file finished executing
So let's say module A is loaded first and imports module B, which then imports module A. When this happens, execution is as follows:
A is imported from something else for first time, A is added to sys.modules
A is executed up to importing B
B is added to sys.modules
B is executed:
when it imports A the partially loaded module is used from sys.modules
B runs completely before resuming
A resumes executing, having access to the complete module B
Note on the step where B imports the partially loaded A: from A import x can only work if x is defined in A before the import B line. Just using import A gives you the module object, which is updated as the file executes, so while it may not have all its definitions right after the import, it will once the file has had a chance to finish executing.
So the simplest way of solving this is to not rely on the other module's contents at import time, meaning all uses of the circularly imported module are inside def blocks that are not called from the module level of execution.
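Both cases can be tried out with throwaway modules. In this sketch all module names (mod_a, mod_b, bad_a, bad_b) are made up: the first pair only touches the other module inside functions, the second pair uses from ... import at module level on a name that is not defined yet.

```python
import importlib
import os
import sys
import tempfile

FILES = {
    # Working cycle: attributes are looked up at call time, inside functions.
    "mod_a.py": (
        "import mod_b\n"
        "def helper():\n"
        "    return 'helper from mod_a'\n"
        "def run():\n"
        "    return mod_b.use_helper()\n"
    ),
    "mod_b.py": (
        "import mod_a  # partially loaded when mod_a is imported first\n"
        "def use_helper():\n"
        "    return mod_a.helper()\n"
    ),
    # Broken cycle: x is requested before bad_a has defined it.
    "bad_a.py": "import bad_b\nx = 1\n",
    "bad_b.py": "from bad_a import x\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for name, source in FILES.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(source)
    sys.path.insert(0, tmp)
    try:
        mod_a = importlib.import_module("mod_a")
        result = mod_a.run()
        try:
            importlib.import_module("bad_a")
            bad_cycle_failed = False
        except ImportError:
            bad_cycle_failed = True
    finally:
        sys.path.remove(tmp)

print(result, bad_cycle_failed)  # helper from mod_a True
```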
I had assumed that in Python if I do some
class A:
    print("hi")
this "hi" would only ever be printed more than once if I explicitly deleted A with some del A
In my slightly bigger project I have this code (names changed):
class A(ISomething):
    print(threading.current_thread())
    try:
        A.MY_DICT # yeah, this should never ever result in anything but an error but neither should this code be run twice
        print(A.MY_DICT)
    except NameError:
        print("A.MY_DICT unknown")
    MY_DICT = {}
and it produces this output:
$ python main.py
<_MainThread(MainThread, started 140438561298240)>
A.MY_DICT unknown
<_MainThread(MainThread, started 140438561298240)>
A.MY_DICT unknown
so on the same thread the same class level code gets executed twice. How is that possible when I never del A? The code had worked before but I don't have commits to narrow down the change that broke it.
The same code with MY_DICT instead of A.MY_DICT fails equally and as PyDev already at time of writing tells me that this will not work, I am pretty confident that there's something fishy going on.
You are probably importing the file under different names, or running it as both the __main__ file and importing it.
When Python runs your script (the file named on the command line) it gives it the name __main__, which is a namespace stored under sys.modules. But if you then import that same file using an import statement, it'll be run again and the resulting namespace is stored under the module name.
Thus, running python main.py where main.py includes an import main statement, or imports other code that in turn imports main again, will result in all the code in main.py being run twice.
Another option is that you are importing the module twice under different full names; both as part of a package and as a stand-alone module. This can happen when both the directory that contains the package and the package itself are listed on your sys.path module search path.
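The package-versus-stand-alone case can be sketched as follows. The package pkg_demo and its module dup_worker are hypothetical names; both the directory containing the package and the package directory itself are put on sys.path, as described above.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "pkg_demo")
    os.mkdir(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "dup_worker.py"), "w") as fh:
        fh.write("STATE = []\n")
    # Both the parent directory and the package directory are searchable.
    sys.path[:0] = [tmp, pkg_dir]
    try:
        packaged = importlib.import_module("pkg_demo.dup_worker")
        standalone = importlib.import_module("dup_worker")
        same_file = os.path.samefile(packaged.__file__, standalone.__file__)
    finally:
        sys.path.remove(tmp)
        sys.path.remove(pkg_dir)

same_object = packaged is standalone
# State changed through one name is invisible through the other.
packaged.STATE.append("only visible via the package name")
print(same_file, same_object, standalone.STATE)  # True False []
```

One file on disk ends up as two independent module objects, which is how module-level or class-level code can appear to run twice.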