Modules imported twice. Possible bug in Python interpreter - python

I've made a rather contrived import scheme in a project of mine and I think I might have discovered a bug in the Python interpreter that causes modules to be imported twice.
Here's how my test project is set up:
/
Launcher.bat: Project is run from here. It launches 'main/__init__.py' using the Python 3.2 executable
main/__init__.py: The __main__ script, the one launched by 'Launcher.bat'
main/foo.py: Contains an empty class
external/__init__.py: A script external to the 'main' project scripts, used to demonstrate the problem
./Launcher.bat
@echo off
C:\Python32\python.exe main\__init__.py
pause
./main/__init__.py
from foo import Foo
print("In 'main', Foo has id:", id(Foo))
# Add the directory from which 'Launcher.bat' was run,
# which is not the same as the './main' directory
# in which this script is located
import sys
sys.path.insert(1, '.')
# This script will try to import class Foo, but in doing so
# will cause the interpreter to import this './main/__init__.py'
# script a second time.
__import__('external')
./main/foo.py
class Foo:
    pass
./external/__init__.py
from main.foo import Foo
print("In 'external', Foo has id:", id(Foo))
All of this will print the 'Main script was imported' message twice. If the external script imports any other scripts, those too will be imported twice. I've only tested this on Python 3.2. Is this a bug, or did I make a mistake?
Output of the program is:
In 'main', Foo has id: 12955136
In 'main', Foo has id: 12955136
In 'external', Foo has id: 12957456
Press any key to continue . . .

I don't think it's a bug. You should ask on the python-dev list for a more authoritative answer. You are executing once (when you run the script) and importing once (from external) so the line gets printed twice. It's not importing twice.
However, this is a horrible setup. There are a lot of style violations here. Granted that some are for demonstration purposes only, it's still quite messy.
You shouldn't use a package __init__.py file as a file which should be run. The main entry point should be a script that imports the package.
You shouldn't have an imported module importing the module which imported it, as your external is doing to main.

The first print is misleading: the first time, you are not importing the file but executing it (__name__ == '__main__' holds true), so the main module is actually imported only once. Move the starting point into a secondary file, or check for __name__ == '__main__'.
By the way, circular imports are a bad idea. You should resolve the circular import (by moving foo to a dedicated library). Alternatively, you can make your modules reentrant (for example, check whether the current directory is already in sys.path before adding it).
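To illustrate the two suggestions above, here is a minimal sketch of a guarded entry point combined with a sys.path insertion that is safe to run more than once. The names ensure_on_path and main are made up for this example and do not come from the question.

```python
import os
import sys


def ensure_on_path(directory, path_list=None):
    """Insert `directory` into the search path only if it is not already there."""
    if path_list is None:
        path_list = sys.path
    directory = os.path.abspath(directory)
    # An empty entry in sys.path stands for the current directory.
    already_present = any(
        os.path.abspath(entry or os.curdir) == directory for entry in path_list
    )
    if not already_present:
        path_list.insert(1, directory)
    return path_list


def main():
    # Hypothetical entry point: only runs when the file is executed directly,
    # not when the same file is imported under its module name.
    ensure_on_path(".")
    print("running as a script")


if __name__ == "__main__":
    main()
```

With the guard in place, a second execution of the file through an import still defines the functions again, but it no longer repeats the side effects.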

Related

Defining and using a decorator function in __init__.py

EDIT: Solved! Solution on the bottom of this post. tl;dr: I should have been using relative imports, and launching python with the correct flag.
So I'm trying to get used to some python stuff (version 2.7.3—it's what's installed on the work PC), and I've got a little project to teach me some python stuff. I've got a module defined like so:
curses_graphics
| __init__.py
| graphicsobject.py
| go_test.py
(code to follow)
I've got a decorator defined in __init__, and I'm trying to use it on methods defined in a class in graphicsobject.py that I'm trying to attach the decorator to, and I'm testing its functionality in go_test.
When I run go_test (just being in the directory and calling "python go_test.py"), I get a no package found error. Same happens if I rename go_test to __main__.py and try to run the package from its parent directory. If I try to run the go_test without importing the package, it doesn't recognise my function.
How can I reference a function defined in __init__.py from within the same package? Is it wrong to try and import while within the package?
Thanks!
__init__.py:
import curses
import logging
#Define logging stuff
gLog = logging.getLogger("cg_log")
gLog.basicConfig(filename="log.log", level=logging.DEBUG)
# Logging decorators for this library
def debug_announce(module_func):
    log=logging.getLogger("cg_log")
    log.info("Curses Graphics Lib:Entering function:"+module_func.__name__)
    module_func()
    log.info("Curses Graphics Lib:Left function:"+module_func.__name__)
go_test.py
@debug_announce
def logTester():
    print("A test method")
logTester()
logger.debug("Curses initialization")
screen=curses.initscr()
curses.start_color()
screen.keypad(True)
logger.debug("Initializing a green block filled with red '.'s")
block = GraphicsOBject.blankline(0, 0, 3, curses.COLOR_GREEN, curses.COLOR_RED, 20, '.')
logger.debug("Drawing the block.")
block.draw(screen)
logger.debug("Pausing execution with a getch")
screen.getch()
logger.debug("Cleaning up curses")
screen.keypad(False)
curses.endwin()
os.system('stty sane')
I can include graphicsobject.py, though I suspect that would be clutter here, as the issue occurs on the first line of go_test.py
Thanks, everyone!
EDIT:
I'm attaching a capture of the errors reported. In the first error, I've added "from curses_graphics import debug_announce" and in the second the code doesn't have that line.
Errors with and without debug_announce import
EDIT:
Some further searching led me to relative imports. For anyone who has my issue, you use those to import something defined in your own module. In my case, I appended "from . import debug_announce" to the head of my go_test.py file. I attempted to run it, and received the error “Attempted relative import in non-package”.
Some further searching led me to this question:
How to fix "Attempted relative import in non-package" even with __init__.py
Which told me that it wasn't attempting to run this program as a package. This meant the "__init__.py" was never running, as the package wouldn't be imported. Further, since I was inside the package, attempting to import the package (i.e. "import curses_graphics") would result in it searching inside curses_graphics... for curses_graphics.
To run this correctly, as the linked question implies, I need to go to the parent directory of curses_graphics and execute it with "python -m curses_graphics.go_test". This imports the package first (running its __init__.py) and then runs go_test as a module inside that package.
(And, FWIW, my code had other issues. Don't take this as an example of how to use curses or logging :P)
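For anyone who wants to try the fix in isolation, here is a small sketch for Python 3 that builds a throwaway package and runs a module in it with -m from the parent directory. The package name demo_graphics is hypothetical, and the decorator prints instead of logging to keep the example short; it is one plausible way to write such a decorator, not the original code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Contents of the hypothetical package's __init__.py.
INIT_SOURCE = '''
import functools

def debug_announce(func):
    """Print a message before and after each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("Entering function: " + func.__name__)
        result = func(*args, **kwargs)
        print("Left function: " + func.__name__)
        return result
    return wrapper
'''

# Contents of go_test.py, which lives inside the package.
GO_TEST_SOURCE = '''
from . import debug_announce  # relative import from the package's __init__.py

@debug_announce
def log_tester():
    print("A test method")

log_tester()
'''


def run_as_module():
    """Create the package in a temp directory and run go_test with -m."""
    with tempfile.TemporaryDirectory() as parent:
        package = Path(parent, "demo_graphics")
        package.mkdir()
        (package / "__init__.py").write_text(INIT_SOURCE)
        (package / "go_test.py").write_text(GO_TEST_SOURCE)
        completed = subprocess.run(
            [sys.executable, "-m", "demo_graphics.go_test"],
            cwd=parent,  # the parent directory of the package
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout.splitlines()


print(run_as_module())
```

Running "python go_test.py" from inside the package directory instead would fail on the relative import, because the file is then not loaded as part of a package.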

Exit from imported script in python

abc.py:
import xyz
#also import other required modules
#other required code
xyz.func()
#other required code
xyz.py:
import threading
import sys
#few threads
def func():
    #start all threads
    threads.start()
    threads.join(timeout)
    #if not joined within given time, terminate all threads and exit from xyz.py
    if (threads.isAlive()):
        sys.exit()
        #I used sys.exit() but it is not working for me. What else can I do here?
I have a python script abc.py that imports another python script xyz.py.
I am using the sys.exit() function in xyz.py. The problem I am facing is that when sys.exit() is executed in xyz.py, my main file, abc.py, also gets terminated. I don't want this to happen. abc.py should keep running even though xyz.py terminates. Is there any way to achieve this? I will be grateful to receive any help/guidance. I am using Python 2.6.6 on CentOS 6.5.
The problem is that xyz.py is both a module and a script at the same time. The common construct if __name__ == '__main__' lets you separate the script part of xyz.py from the module part. More information here: What does if __name__ == “__main__”: do?
Also, you are misunderstanding how import works. There is no such thing as termination of abc.py or xyz.py; there is a single interpreter, which maintains a global namespace containing abc.py's objects. When the interpreter meets the import xyz statement, it simply adds the name xyz to that namespace and, to build its content, interprets the statements in that file. When it meets sys.exit(), it executes it, thus exiting the interpreter itself.
Maybe you need to keep both files as scripts and keep their interpreters separate? Then, instead of importing, use subprocess.call, i.e.:
import subprocess
import sys
subprocess.call([sys.executable, 'xyz.py'])
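If you would rather keep the import, another option is to rely on the fact that sys.exit() works by raising SystemExit, which the caller can catch. The sketch below uses a stand-in func and a hypothetical helper call_without_dying; it only helps when sys.exit() is called in the same thread as the caller, and it does not stop any worker threads that are still running.

```python
import sys


def func():
    """Stand-in for xyz.func(): gives up by calling sys.exit()."""
    sys.exit(3)


def call_without_dying(target):
    """Call target and return its exit code instead of letting the interpreter exit."""
    try:
        target()
    except SystemExit as exc:
        return exc.code
    return None


exit_code = call_without_dying(func)
print("func() asked to exit with code", exit_code, "but the caller is still running")
```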

Python defines my class twice on the same thread

I had assumed that in Python if I do some
class A:
    print("hi")
this "hi" would only ever be printed more than once if I explicitly deleted A with some del A
In my slightly bigger project I have this code (names changed):
class A(ISomething):
    print(threading.current_thread())
    try:
        A.MY_DICT # yeah, this should never ever result in anything but an error but neither should this code be run twice
        print(A.MY_DICT)
    except NameError:
        print("A.MY_DICT unknown")
    MY_DICT = {}
and it produces this output:
$ python main.py
<_MainThread(MainThread, started 140438561298240)>
A.MY_DICT unknown
<_MainThread(MainThread, started 140438561298240)>
A.MY_DICT unknown
so on the same thread the same class level code gets executed twice. How is that possible when I never del A? The code had worked before but I don't have commits to narrow down the change that broke it.
The same code with MY_DICT instead of A.MY_DICT fails equally and as PyDev already at time of writing tells me that this will not work, I am pretty confident that there's something fishy going on.
You are probably importing the file under different names, or running it as both the __main__ file and importing it.
When Python runs your script (the file named on the command line) it gives it the name __main__, which is a namespace stored under sys.modules. But if you then import that same file using an import statement, it'll be run again and the resulting namespace is stored under the module name.
Thus, running python main.py, where main.py includes an import main statement or imports other code that in turn imports main again, will result in all the code in main.py being run twice.
Another option is that you are importing the module twice under different full names; both as part of a package and as a stand-alone module. This can happen when both the directory that contains the package and the package itself are listed on your sys.path module search path.
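The first scenario is easy to reproduce. This sketch, assuming Python 3, writes a throwaway main.py that imports itself and runs it as a script in a child interpreter; the helper name run_script is made up for the demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

SCRIPT_SOURCE = '''
print("module body running as", __name__)
import main  # re-imports this same file under the name "main"
'''


def run_script():
    """Run main.py as a script and return the lines it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "main.py").write_text(SCRIPT_SOURCE)
        completed = subprocess.run(
            [sys.executable, "main.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout.splitlines()


for line in run_script():
    print(line)
```

The body runs once as __main__ and once as main. The import inside the second run finds main already in sys.modules, so the recursion stops there.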

How to prevent a module from being imported twice?

When writing Python modules, is there a way to prevent them from being imported twice by client code? Just like C/C++ header files do:
#ifndef XXX
#define XXX
...
#endif
Thanks very much!
Python modules aren't imported multiple times. Just running import two times will not reload the module. If you want it to be reloaded, you have to call the reload function (a builtin in Python 2; importlib.reload in Python 3). Here's a demo:
foo.py is a module with the single line
print("Hello, I am being imported")
And here is a screen transcript of multiple import attempts.
>>> import foo
Hello, I am being imported
>>> import foo # Will not print the statement
>>> reload(foo) # Will print it again
Hello, I am being imported
Imports are cached, and only run once. Additional imports only cost the lookup time in sys.modules.
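The same behaviour can be checked programmatically. Here is a sketch for Python 3, where reload lives in importlib; the module name demo_cached_mod and the helper demo are invented for this example.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def demo():
    """Import a throwaway module twice, reload it once, and count body executions."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_cached_mod.py").write_text('print("module body executed")\n')
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                first = importlib.import_module("demo_cached_mod")   # body runs
                second = importlib.import_module("demo_cached_mod")  # served from sys.modules
                importlib.reload(first)                               # body runs again
        finally:
            # Undo the changes to interpreter state made for the demo.
            sys.path.remove(tmp)
            sys.modules.pop("demo_cached_mod", None)
        return first is second, buffer.getvalue().count("module body executed")


same_object, executions = demo()
print("same module object:", same_object, "| body executed", executions, "times")
```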
As specified in other answers, Python generally doesn't reload a module when encountering a second import statement for it. Instead, it returns its cached version from sys.modules without executing any of its code.
However there are several pitfalls worth noting:
Importing the main module as an ordinary module effectively creates two instances of the same module under different names.
This occurs because during program startup the main module is set up with the name __main__. Thus, when importing it as an ordinary module, Python doesn't detect it in sys.modules and imports it again, but with its proper name the second time around.
Consider the file /tmp/a.py with the following content:
# /tmp/a.py
import sys
print "%s executing as %s, recognized as %s in sys.modules" % (__file__, __name__, sys.modules[__name__])
import b
Another file /tmp/b.py has a single import statement for a.py (import a).
Executing /tmp/a.py results in the following output:
root@machine:/tmp$ python a.py
a.py executing as __main__, recognized as <module '__main__' from 'a.py'> in sys.modules
/tmp/a.pyc executing as a, recognized as <module 'a' from '/tmp/a.pyc'> in sys.modules
Therefore, it is best to keep the main module fairly minimal and export most of its functionality to an external module, as advised here.
This answer specifies two more possible scenarios:
Slightly different import statements utilizing different entries in sys.path leading to the same module.
Attempting another import of a module after a previous one failed halfway through.

module reimported if imported from different path

In a big application I am working on, several people import the same modules differently, e.g.
import x
or
from y import x
The side effect is that x is imported twice, which may introduce very subtle bugs if someone is relying on global attributes.
For example, suppose I have a package mypackage with three files: mymodule.py, main.py and __init__.py.
mymodule.py contents
l = []
class A(object): pass
main.py contents
def add(x):
    from mypackage import mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    import mymodule
    return mymodule.l
add(1)
print "lets check",get()
add(1)
print "lets check again",get()
it prints
updated list [1]
lets check []
updated list [1, 1]
lets check again []
because now there are two lists in two different modules; similarly, class A is different.
To me it looks serious enough, because the classes themselves will be treated differently,
e.g. the code below prints False
def create():
    from mypackage import mymodule
    return mymodule.A()
def check(a):
    import mymodule
    return isinstance(a, mymodule.A)
print check(create())
Question:
Is there any way to avoid this, other than enforcing that the module should be imported one way only? Can't this be handled by the Python import mechanism? I have seen several bugs related to this in Django code and elsewhere too.
Each module namespace is imported only once. The issue is that you're importing them differently. In the first case you're importing from the global package, and in the second you're doing a local, non-packaged import. Python sees the two modules as different. The first import is internally cached as mypackage.mymodule and the second one as mymodule only.
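The question's code relies on Python 2 behaviour, but the same double caching can be reproduced on Python 3 whenever both the package's parent directory and the package directory itself end up on sys.path. This sketch rebuilds the layout in a temp directory and runs main.py from inside the package; the helper name run_inside_package is made up, and setting PYTHONPATH to the temp directory is an assumption that mirrors the setup in the question.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

MAIN_SOURCE = '''
from mypackage import mymodule as packaged   # cached as "mypackage.mymodule"
import mymodule as standalone                # cached as "mymodule"

packaged.l.append(1)
print(packaged is standalone, packaged.l, standalone.l)
'''


def run_inside_package():
    """Run mypackage/main.py as a script with the package's parent on PYTHONPATH."""
    with tempfile.TemporaryDirectory() as tmp:
        package = Path(tmp, "mypackage")
        package.mkdir()
        (package / "__init__.py").write_text("")
        (package / "mymodule.py").write_text("l = []\n")
        (package / "main.py").write_text(MAIN_SOURCE)
        env = dict(os.environ, PYTHONPATH=tmp)
        completed = subprocess.run(
            [sys.executable, str(Path("mypackage", "main.py"))],
            cwd=tmp,
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout.strip()


print(run_inside_package())
```

The two names refer to different module objects, so appending to one list leaves the other empty.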
A way to solve this is to always use absolute imports. That is, always give your module absolute import paths from the top-level package onwards:
def add(x):
    from mypackage import mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    from mypackage import mymodule
    return mymodule.l
Remember that your entry point (the file you run, main.py) should also be outside the package. When you want the entry point code to be inside the package, you usually run a small script instead. Example:
runme.py, outside the package:
from mypackage.main import main
main()
And in main.py you add:
def main():
    # your code
    pass
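As a rough check of this layout on Python 3, the sketch below puts a runme.py next to the package and lists which sys.modules entries end in "mymodule" after main() has run. The helper name run_from_outside is hypothetical, and the body of main() is only a placeholder for real code.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

MAIN_SOURCE = '''
import sys
from mypackage import mymodule

def main():
    mymodule.l.append(1)
    # List every cached module whose name ends in "mymodule".
    print([name for name in sorted(sys.modules) if name.endswith("mymodule")])
'''

RUNME_SOURCE = '''
from mypackage.main import main
main()
'''


def run_from_outside():
    """Run runme.py, which sits outside the package, and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        package = Path(tmp, "mypackage")
        package.mkdir()
        (package / "__init__.py").write_text("")
        (package / "mymodule.py").write_text("l = []\n")
        (package / "main.py").write_text(MAIN_SOURCE)
        Path(tmp, "runme.py").write_text(RUNME_SOURCE)
        completed = subprocess.run(
            [sys.executable, "runme.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return completed.stdout.strip()


print(run_from_outside())
```

Only the fully qualified name shows up, because the package directory itself never lands on sys.path.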
I find this document by Jp Calderone to be a great tip on how to (not) structure your python project. Following it you won't have issues. Pay attention to the bin folder - it is outside the package. I'll reproduce the entire text here:
Filesystem structure of a Python project
Do:
name the directory something related to your project. For example, if your project is named "Twisted", name the top-level directory for its source files Twisted. When you do releases, you should include a version number suffix: Twisted-2.5.
create a directory Twisted/bin and put your executables there, if you have any. Don't give them a .py extension, even if they are Python source files. Don't put any code in them except an import of and call to a main function defined somewhere else in your projects.
If your project is expressable as a single Python source file, then put it into the directory and name it something related to your project. For example, Twisted/twisted.py. If you need multiple source files, create a package instead (Twisted/twisted/, with an empty Twisted/twisted/__init__.py) and place your source files in it. For example, Twisted/twisted/internet.py.
put your unit tests in a sub-package of your package (note - this means that the single Python source file option above was a trick - you always need at least one other file for your unit tests). For example, Twisted/twisted/test/. Of course, make it a package with Twisted/twisted/test/__init__.py. Place tests in files like Twisted/twisted/test/test_internet.py.
add Twisted/README and Twisted/setup.py to explain and install your software, respectively, if you're feeling nice.
Don't:
put your source in a directory called src or lib. This makes it hard to run without installing.
put your tests outside of your Python package. This makes it hard to run the tests against an installed version.
create a package that only has a __init__.py and then put all your code into __init__.py. Just make a module instead of a package, it's simpler.
try to come up with magical hacks to make Python able to import your module or package without having the user add the directory containing it to their import path (either via PYTHONPATH or some other mechanism). You will not correctly handle all cases and users will get angry at you when your software doesn't work in their environment.
I can only replicate this if main.py is the file you are actually running. In that case you will get the directory of main.py on sys.path. But you apparently also have a path set so that mypackage can be imported.
In that situation Python will not realize that mymodule and mypackage.mymodule are the same module, and you get this effect. This change illustrates it:
def add(x):
    from mypackage import mymodule
    print "mypackage.mymodule path", mymodule
    mymodule.l.append(x)
    print "updated list",mymodule.l
def get():
    import mymodule
    print "mymodule path", mymodule
    return mymodule.l
add(1)
print "lets check",get()
add(1)
print "lets check again",get()
$ export PYTHONPATH=.
$ python mypackage/main.py
mypackage.mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
mymodule path <module 'mymodule' from '/tmp/mypackage/mymodule.pyc'>
But add another main file, in the current directory:
realmain.py:
from mypackage import main
and the result is different:
mypackage.mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
mymodule path <module 'mypackage.mymodule' from '/tmp/mypackage/mymodule.pyc'>
So I suspect that you have your main python file within the package. And in that case the solution is to not do that. :-)
