calling execfile() in custom namespace executes code in '__builtin__' namespace - python

When I call execfile without passing the globals or locals arguments it creates objects in the current namespace, but if I call execfile and specify a dict for globals (and/or locals), it creates objects in the __builtin__ namespace.
Take the following example:
# exec.py
def myfunc():
    print 'myfunc created in %s namespace' % __name__
exec.py is execfile'd from main.py as follows.
# main.py
print 'execfile in global namespace:'
execfile('exec.py')
myfunc()
print
print 'execfile in custom namespace:'
d = {}
execfile('exec.py', d)
d['myfunc']()
when I run main.py from the commandline I get the following output.
execfile in global namespace:
myfunc created in __main__ namespace
execfile in custom namespace:
myfunc created in __builtin__ namespace
Why is it being run in __builtin__ namespace in the second case?
Furthermore, if I then try to run myfunc from __builtins__, I get an AttributeError. (This is what I would hope happens, but then why is __name__ set to __builtin__?)
>>> __builtins__.myfunc()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'myfunc'
Can anyone explain this behaviour?
Thanks

First off, __name__ is not a namespace; it's a reference to the name of the module it belongs to, i.e. somemod.py -> somemod.__name__ == 'somemod'.
The exception is a module run as an executable from the command line, in which case __name__ is '__main__'.
In your example it is a coincidence that the module being run as main is also named main.
execfile executes the contents of the file WITHOUT importing it as a module. As such, __name__ doesn't get set in the dictionary you pass in, because the code is not a module; it's just an executed sequence of code. When myfunc then looks up __name__, the name is missing from its globals (your dict d), so the lookup falls through to the built-in namespace and finds the __name__ of the __builtin__ module, which is '__builtin__'.
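The lookup can be reproduced with a small sketch. It assumes Python 3, where exec() takes the place of execfile() and the built-in module is called builtins rather than __builtin__; the source string and the name 'custom' are made up for illustration.

```python
# Source text standing in for exec.py (hypothetical content).
SOURCE = "def myfunc():\n    return __name__\n"

# Case 1: an empty dict. exec() adds '__builtins__' but no '__name__',
# so the lookup inside myfunc falls through to the built-in namespace.
empty_ns = {}
exec(SOURCE, empty_ns)
name_from_empty = empty_ns["myfunc"]()

# Case 2: the dict provides its own '__name__', which is found first.
named_ns = {"__name__": "custom"}
exec(SOURCE, named_ns)
name_from_named = named_ns["myfunc"]()

print(name_from_empty)   # 'builtins' on CPython 3
print(name_from_named)   # 'custom'
```

Seeding the dictionary with a '__name__' key is therefore enough to control what the executed code reports.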

The execfile function is similar to the exec statement. If you look at the documentation for exec you'll see the following paragraph that explains the behavior.
As a side effect, an implementation may insert additional keys into the dictionaries given besides those corresponding to variable names set by the executed code. For example, the current implementation may add a reference to the dictionary of the built-in module __builtin__ under the key __builtins__ (!).
Edit: I now see that my answer applies to one possible interpretation of the question title. My answer does not apply to the actual question asked.

As an aside, I prefer using __import__() over execfile:
module = __import__(module_name)
value = module.__dict__[function_name](arguments)
This also works well when adding to the PYTHONPATH, so that modules in other directories can be imported:
sys.path.insert(position, directory)
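In current Python versions the same idea is usually written with importlib.import_module() and getattr(). A minimal sketch, assuming Python 3 and using the standard-library module math and its sqrt function purely as stand-ins for module_name and function_name:

```python
import importlib

module_name = "math"      # stand-in for the module you want to load
function_name = "sqrt"    # stand-in for the function you want to call

# Import the module by its string name, then fetch the function by name.
module = importlib.import_module(module_name)
function = getattr(module, function_name)

value = function(9.0)
print(value)  # 3.0
```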

Related

Python 'from x import z' imports more than just 'z' [duplicate]

I've noticed that asyncio/__init__.py from Python 3.6 uses the following construct:
from .base_events import *
...
__all__ = (base_events.__all__ + ...)
The base_events symbol is not imported anywhere in the source code, yet the module still contains a local variable for it.
I've checked this behavior with the following code, put into an __init__.py with a dummy test.py next to it:
test = "not a module"
print(test)
from .test import *
print(test)
not a module
<module 'testpy.test' from 'C:\Users\MrM\Desktop\testpy\test.py'>
Which means that the test variable got shadowed after using a star import.
I fiddled with it a bit, and it turns out that it doesn't have to be a star import, but it has to be inside an __init__.py, and it has to be relative. Otherwise the module object is not being assigned anywhere.
Without the assignment, running the above example from a file that isn't an __init__.py will raise a NameError.
Where is this behavior coming from? Has this been outlined in the spec for import system somewhere? What's the reason behind __init__.py having to be special in this way? It's not in the reference, or at least I couldn't find it.
This behavior is defined in The import system documentation, section 5.4.2 Submodules:
When a submodule is loaded using any mechanism (e.g. importlib APIs,
the import or import-from statements, or built-in __import__()) a
binding is placed in the parent module’s namespace to the submodule
object. For example, if package spam has a submodule foo, after
importing spam.foo, spam will have an attribute foo which is bound to
the submodule.
A package namespace includes the namespace created in __init__.py plus extras added by the import system. The why is for namespace consistency.
Given Python’s familiar name binding rules this might seem surprising,
but it’s actually a fundamental feature of the import system. The
invariant holding is that if you have sys.modules['spam'] and
sys.modules['spam.foo'] (as you would after the above import), the
latter must appear as the foo attribute of the former.
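The invariant quoted above can be checked directly. The sketch below assumes Python 3 and builds a throwaway package in a temporary directory; the names demo_pkg, foo and VALUE are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/__init__.py and demo_pkg/foo.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("")  # empty __init__.py: it never assigns `foo` itself
with open(os.path.join(pkg_dir, "foo.py"), "w") as fh:
    fh.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Importing only the package does not bind the submodule.
package = importlib.import_module("demo_pkg")
bound_before = hasattr(package, "foo")

# Importing the submodule makes the import system set demo_pkg.foo.
importlib.import_module("demo_pkg.foo")
bound_after = hasattr(package, "foo")
same_object = package.foo is sys.modules["demo_pkg.foo"]

print(bound_before, bound_after, same_object)  # False True True
```

Since a package's namespace is the global namespace of its __init__.py, code in __init__.py sees the new attribute as an ordinary global name.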
This appears to have everything to do with how the interpreter resolves variable assignments at the module/submodule level. We may be able to learn more if we interrogate the assignments using code executed outside the module we are examining.
In my example, I have the following:
Code listing for src/example/package/module.py:
from logging import getLogger
__all__ = ['fn1']
logger = getLogger(__name__)
def fn1():
    logger.warning('running fn1')
    return 'fn1'
Code listing for src/example/package/__init__.py:
def print_module():
    print("`module` is assigned with %r" % module)
Now execute the following in the interactive interpreter:
>>> from example.package import print_module
>>> print_module()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/example.package/src/example/package/__init__.py", line 2, in print_module
print("`module` is assigned with %r" % module)
NameError: name 'module' is not defined
So far so good, the exception looks perfectly normal. Now let's see what happens if example.package.module gets imported:
>>> import example.package.module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Given that a relative import is shorthand for the full import, let's modify __init__.py to contain the absolute import instead (the same import that was just run in the interactive interpreter) and see what happens:
import example.package.module
def print_module():
    print("`module` is assigned with %r" % module)
Launching the interactive interpreter once more, we see this:
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Note that __init__.py actually represents the module binding example.package. An intuition might be that when example.package.module is imported, the interpreter assigns module into example.package to aid with the resolution of example.package.module, regardless of whether the import is absolute or relative. This seems to be a particular quirk of executing code in a module that may have submodules (i.e. __init__.py).
Actually, one more test. Let's see if there is just something weird to do with variable assignments. Modify src/example/package/__init__.py to:
import example.package.module
def print_module():
    print("`module` is assigned with %r" % module)
def delete_module():
    del module
The new function would test whether or not module was actually assigned to the scope at __init__.py. Executing this we learn that:
>>> from example.package import print_module, delete_module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
>>> delete_module()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/tmp/example.package/src/example/package/__init__.py", line 7, in delete_module
del module
UnboundLocalError: local variable 'module' referenced before assignment
The call fails, but note what kind of failure it is: an UnboundLocalError rather than a NameError. Because del module appears inside delete_module, the compiler treats module as a local variable of that function, so this test cannot show whether the name was assigned in the scope of __init__.py. What the earlier experiments do show is that the interpreter can resolve the name module within example.package (even from code inside __init__.py) once example.package.module has been imported, and that this binding comes from the import system rather than from any assignment written in __init__.py.
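One detail worth checking when reading the traceback above is how del interacts with function scope. The following sketch is independent of the package example (the global name value and both functions are hypothetical) and shows that a del statement inside a function makes the name local to that function unless it is declared global:

```python
value = "module-level binding"

def delete_without_global():
    # `del value` makes `value` a local name of this function, so the
    # statement fails even though a global `value` exists.
    del value

def delete_with_global():
    global value
    del value  # removes the module-level binding

try:
    delete_without_global()
    outcome = "deleted"
except UnboundLocalError:
    outcome = "UnboundLocalError"

still_bound = "value" in globals()
delete_with_global()
bound_after_global_del = "value" in globals()

print(outcome, still_bound, bound_after_global_del)
```

An UnboundLocalError from such a function therefore says nothing about whether the module-level name exists.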
I haven't looked at the specific PEPs that deal with assignment and name resolution for modules and imports. Still, this little exercise shows that the behavior does not depend on relative imports and that the binding happens regardless of when or where the import is done. There might be more detail in those PEPs, but hopefully this provides a better understanding of how Python's import system resolves names relating to imported modules.

Why does deleting a global variable named __builtins__ prevent only the REPL from accessing builtins?

I have a python script with the following contents:
# foo.py
__builtins__ = 3
del __builtins__
print(int) # <- this still works
Curiously, executing this script with the -i flag prevents only the REPL from accessing builtins:
aran-fey#starlight ~> python3 -i foo.py
<class 'int'>
>>> print(int)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'print' is not defined
How come the script can access builtins, but the REPL can't?
CPython doesn't look up __builtins__ every time it needs to do a built-in variable lookup. Each frame object has an f_builtins member holding its built-in variable dict, and built-in variable lookup goes through there.
f_builtins is set on frame object creation. If a new frame has no parent frame (f_back), or a different global variable dict from its parent frame, then frame object initialization looks up __builtins__ to set f_builtins. (If the new frame shares a global dict with its parent frame, then it inherits its parent's f_builtins.) This is the only way __builtins__ is involved in built-in variable lookup. You can see the code that handles this in _PyFrame_New_NoTrack.
When you delete __builtins__ inside a script, that doesn't affect f_builtins. The rest of the code executing in the script's stack frame still sees builtins. Once the script completes and -i drops you into interactive mode, every interactive command gets a new stack frame (with no parent), and the __builtins__ lookup is repeated. This is when the deleted __builtins__ finally matters.
The execution context is different. Within the REPL we are working line by line (Read, Eval, Print, Loop), which gives the global execution scope an opportunity to change between steps. When the runtime executes a module, by contrast, it loads the module's code and then execs it as a whole within a scope.
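The role of __builtins__ at frame creation can be observed without the REPL. A minimal sketch, assuming CPython 3: exec() starts a new frame with the globals dict it is given and consults that dict's __builtins__ entry once to decide which built-ins the code sees. The dict contents here are made up for illustration.

```python
# Globals whose '__builtins__' entry exposes only len().
restricted = {"__builtins__": {"len": len}}

# len is found through the restricted built-ins mapping.
exec("size = len('abc')", restricted)

# print is not in that mapping, so the lookup fails with NameError.
try:
    exec("print('hello')", restricted)
    print_lookup = "found"
except NameError:
    print_lookup = "NameError"

print(restricted["size"], print_lookup)  # 3 NameError
```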
In CPython, the builtins namespace associated with the execution of a code block is found by looking up the name __builtins__ in the global namespace; this should be bound to a dictionary or a module (in the latter case the module's dictionary is used). When in the __main__ module, __builtins__ is the built-in module builtins, otherwise __builtins__ is bound to the dictionary of the builtins module itself. In both contexts of your question, we are in the __main__ module.
What's important is that CPython only looks up the builtins once, right before it begins executing your code. In the REPL, this happens every time a new statement is executed. But when executing a python script, the entire content of the script is one single unit. That is why deleting the builtins in the middle of the script has no effect.
To more closely replicate that context inside a REPL, you would not enter the code of the module line by line, but instead use a compound statement:
>>> if 1:
... del __builtins__
... print(123)
...
123
>>> print(123)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'print' is not defined
Naturally, you're probably now wondering how to remove builtins from within a script. The answer should be obvious: you can't do it by rebinding a name, but you can do it by mutation:
# foo2.py
__builtins__.__dict__.clear()
print(int) # <- NameError: name 'print' is not defined
As a final note, the fact that the name __builtins__ is bound at all is an implementation detail of CPython, and that is explicitly documented:
Users should not touch __builtins__; it is strictly an implementation detail.
Don't rely on __builtins__ for anything serious; if you need access to that scope, the correct way is to import builtins and go from there.
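For reference, a short sketch of that approach, assuming Python 3 (in Python 2 the module is named __builtin__). The variable names are illustrative only.

```python
import builtins

# The module exposes every built-in by name.
same_int = builtins.int is int
print_function = getattr(builtins, "print")

# Look up a built-in whose name is only known at run time.
name = "len"
length = getattr(builtins, name)("hello")

print(same_int, print_function is print, length)  # True True 5
```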

calling separate python program in unit test

I am new to Python and unit testing. The following is the main unittest program, which calls other Python programs that act as test cases:
import unittest
from test import test_support

class MyTestCase1(unittest.TestCase):
    def test_feature_one(self):
        print "testing feature one"
        execfile("/root/test/add.py")

def test_main():
    test_support.run_unittest(MyTestCase1)

if __name__ == '__main__':
    test_main()
add.py is a basic program that adds two numbers and displays the result.
#!/usr/bin/env python
import sys

def disp(r):
    print r

def add():
    res = 3 + 5
    disp(res)

add()
But there is a problem when I call one function from another. I hit the following error when I run the unit test (the first program), yet if I run add.py on its own, outside the unit test suite, it works fine. I need help understanding this scenario.
======================================================================
ERROR: test_feature_one (__main__.MyTestCase1)
----------------------------------------------------------------------
Traceback (most recent call last):
File "first.py", line 17, in test_feature_one
execfile("/root/test/add.py")
File "/root/test/add.py", line 12, in <module>
add()
File "/root/test/add.py", line 10, in add
disp(res)
NameError: global name 'disp' is not defined
----------------------------------------------------------------------
From docs on execfile (https://docs.python.org/2/library/functions.html#execfile) :
"Remember that at module level, globals and locals are the same dictionary.
...
If the locals dictionary is omitted it defaults to the globals dictionary. If both dictionaries are omitted, the expression is executed in the environment where execfile() is called."
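I'm not very familiar with exactly how globals and locals work here, so I can't give a deep explanation, but from what I understand,
the key is that you're running execfile from inside a function. If you run it at module level, it will work:
if __name__ == '__main__':
    execfile('blah')
But if you run it from a function:
def f():
    execfile('blah')
if __name__ == '__main__':
    f()
it will fail, because of how globals and locals interact: called without dictionaries inside a function, execfile uses the caller's globals and the function's local namespace. The def statements in the executed file bind disp and add in that local namespace, but when add() runs it looks up disp as a global, and the global namespace has no such name.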
How to fix your example: pass a dictionary as the globals argument of execfile, and it will work (remember that line from the docs: "If the locals dictionary is omitted it defaults to the globals dictionary.").
But instead of using execfile, I'd recommend importing add from add.py and just calling it in the test. (That will also require moving the call to add in add.py under if __name__ == '__main__':, so that add does not run on import.)
Here is some info on how globals and locals work: http://www.diveintopython.net/html_processing/locals_and_globals.html .
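The failure and the fix can be reproduced in isolation. This is a sketch under assumptions: it uses Python 3, where exec() on a source string stands in for execfile(), and SOURCE is a hypothetical variant of add.py that returns values instead of printing them. Passing separate globals and locals dictionaries mimics what happens when execfile() is called inside a function.

```python
SOURCE = """
def disp(r):
    return r

def add():
    return disp(3 + 5)

result = add()
"""

def run_with_split_namespaces():
    # Distinct globals and locals: the defs land in the locals dict,
    # but add() looks up disp in the (empty) globals dict.
    try:
        exec(SOURCE, {}, {})
    except NameError as exc:
        return "NameError: %s" % exc
    return "ok"

def run_with_single_namespace():
    # One dict used for both globals and locals: disp is found.
    namespace = {}
    exec(SOURCE, namespace)
    return namespace["result"]

print(run_with_split_namespaces())
print(run_with_single_namespace())  # 8
```

With a single dictionary, the function definitions and the later global lookups share one namespace, which is why passing a dictionary to execfile resolves the error.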

Bug in Python's documentation?

I am reading http://docs.python.org/2/tutorial/modules.html#more-on-modules and wonder if the following is correct:
Modules can import other modules. It is customary but not required to
place all import statements at the beginning of a module (or script,
for that matter). The imported module names are placed in the
importing module’s global symbol table.
Apparently not:
>>> def foo(): import sys
...
>>> foo()
>>> sys.path
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sys' is not defined
See http://ideone.com/cLK09v for an online demo.
So, is it a bug in the Python's documentation or I don't understand something?
Yes, this is a documentation error. The import statement binds the imported names in the current namespace. Usually import is used outside of functions and classes, but as you've discovered, it does work within them. In your example function, the module is imported into the function's local namespace when the function is called, which does not make it available outside the function.
The global keyword does work here, however:
def foo():
    global sys
    import sys
foo()
sys.path
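The difference between the two forms can be checked with a small sketch. It assumes Python 3 and uses the standard-library module colorsys only because nothing else in the example imports it; the function names are hypothetical.

```python
def import_locally():
    import colorsys  # bound only in this function's local namespace
    return "colorsys" in locals()

def import_globally():
    global colorsys  # make the import bind a module-level name
    import colorsys

local_only = import_locally()
visible_before = "colorsys" in globals()

import_globally()
visible_after = "colorsys" in globals()

print(local_only, visible_before, visible_after)  # True False True
```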
I don't think this is actually an error in the documentation, but more of a misinterpretation. You simply have a scope issue: you are importing sys in the scope of the function foo(). You could certainly do as the documentation suggests and put the import at the beginning of the file, or somewhere else in the file that has the module's global scope. The sentence at issue is "The imported module names are placed in the importing module’s global symbol table"; here the import statement runs inside the function foo(), not at the module's global level, so the name is bound in foo()'s scope.

Some confusion regarding imports in Python

I'm new to Python and there's something that's been bothering me for quite some time. I read in "Learning Python" by Mark Lutz that when we use a from statement to import a name present in a module, it first imports the module, then assigns a new name to it (i.e. the name of the function, class, etc. present in the imported module) and then deletes the module object with the del statement. However what happens if I try to import a name using from that references a name in the imported module that itself is not imported? Consider the following example in which there are two modules mod1.py and mod2.py:
#mod1.py
from mod2 import test
test('mod1.py')
#mod2.py
def countLines(name):
    print len(open(name).readlines())
def countChars(name):
    print len(open(name).read())
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
Now see what happens when I run or import mod1:
>>> import mod1
loading...
3
44
----------
Here when I imported and ran the test function, it ran successfully although I didn't even import countChars or countLines, and the from statement had already deleted the mod2 module object.
So I basically need to know why this code works even though considering the problems I mentioned it shouldn't.
EDIT: Thanx alot to everyone who answered :)
Every function has a __globals__ attribute, which holds a reference to the environment in which it searches for global variables and functions.
The test function is therefore linked to the global variables of mod2. So when it calls countLines, the interpreter will always find the right function, even if you write a new one with the same name in the module that imports test.
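This can be verified with a sketch that builds a stand-in module in memory. It assumes Python 3; the module name mod2_demo and the helper function are hypothetical and replace mod2 and countLines from the question.

```python
import types

# Create a module object and run some source code in its namespace.
mod2 = types.ModuleType("mod2_demo")
exec(
    "def helper():\n"
    "    return 'helper from mod2_demo'\n"
    "def test():\n"
    "    return helper()\n",
    mod2.__dict__,
)

# Equivalent in effect to `from mod2_demo import test`.
test = mod2.test

# A same-named function in the importing namespace is ignored by test().
def helper():
    return 'helper from the importer'

result = test()
globals_are_module_dict = test.__globals__ is mod2.__dict__
print(result, globals_are_module_dict)
```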
I think you're wrestling with the way python handles namespaces. when you type from module import thing you are bringing thing from module into your current namespace. So, in your example, when mod1 gets imported, the code is evaluated in the following order:
from mod2 import test #Import mod2, bring test function into current module namespace
test("mod1.py") #run the test function (defined in mod2)
And now for mod2:
#create a new function named 'test' in the current (mod2) namespace
#the first time this module is imported. Note that this function has
#access to the entire namespace where it is defined (mod2).
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
The reason all of this is important is that Python lets you choose exactly what you want to pull into your namespace. For example, say you have a module1 which defines a function cool_func. Now you are writing another module (module2), and it makes sense for module2 to have a function cool_func as well. Python allows you to keep those separate. In module3 you could do:
import module1
import module2
module1.cool_func()
module2.cool_func()
Or, you could do:
from module1 import cool_func
import module2
cool_func() #module1
module2.cool_func()
or you could do:
from module1 import cool_func as cool
from module2 import cool_func as cooler
cool() #module1
cooler() #module2
The possibilities go on ...
Hopefully my point is clear. When you import an object from a module, you are choosing how you want to reference that object in your current namespace.
The other answers are better articulated than this one, but if you run the following you can see that countChars and countLines are actually both defined in test.__globals__:
from pprint import pprint
from mod2 import test
pprint(test.__globals__)
test('mod1')
You can see that importing test brings along the other globals defined in mod2, letting you run the function without worrying about having to import everything you need.
Each module has its own scope. Within mod1, you cannot use the names countLines or countChars (or mod2).
mod2 itself isn't affected in the least by how it happens to be imported elsewhere; all names defined in it are available within the module.
If the book you reference really says that the module object is deleted with the del statement, it's wrong: del only removes names; it doesn't delete objects.
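A small sketch of that last point, assuming Python 3. It spells out by hand roughly what from math import sin does (import, bind one name, drop the module name) and then checks what survives.

```python
import sys

import math        # binds the name `math` in this namespace
sin = math.sin     # keep a reference to one function
del math           # removes only the name, not the module object

name_still_bound = "math" in globals()
module_still_loaded = "math" in sys.modules
value = sin(0.0)

print(name_still_bound, module_still_loaded, value)  # False True 0.0
```

The module object stays alive in sys.modules, and the function keeps working because it never depended on the name that was deleted.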
From A GUIDE TO PYTHON NAMESPACES,
Even though modules have their own global namespaces, this doesn’t mean that all names can be used from everywhere in the module. A scope refers to a region of a program from where a namespace can be accessed without a prefix. Scopes are important for the isolation they provide within a module. At any time there are a number of scopes in operation: the scope of the current function you’re in, the scope of the module and then the scope of the Python builtins. This nesting of scopes means that one function can’t access names inside another function.
Namespaces are also searched for names inside out. This means that if there is a certain name declared in the module’s global namespace, you can reuse the name inside a function while being certain that any other function will get the global name. Of course, you can force the function to use the global name by prefixing the name with the ‘global’ keyword. But if you need to use this, then you might be better off using classes and objects.
An import statement loads the whole module into memory, which is why the test() function ran successfully.
Because you used the from statement, you can't use countLines and countChars directly, but test can certainly call them.
The from statement basically loads the whole module and binds the imported function, variable, etc. in the importing module's global namespace.
For example:
>>> from math import sin
>>> sin(90) #now sin() is a global variable in the module and can be accessed directly
0.89399666360055785
>>> math
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
math
NameError: name 'math' is not defined
>>> vars() #shows the current namespace, and there's sin() in it
{'__builtins__': <module '__builtin__' (built-in)>, '__file__': '/usr/bin/idle', '__package__': None, '__name__': '__main__', 'main': <function main at 0xb6ac702c>, 'sin': <built-in function sin>, '__doc__': None}
consider a simple file, file.py:
def f1():
    print 2+2
def f2():
    f1()
import only f2:
>>> from file import f2
>>> f2()
4
Although I imported only f2() and not f1(), f1() still ran successfully. The whole module is loaded into memory; we can only access f2() directly, but f2() can access the other parts of the module.
