Understanding `from ... import ...` behavior [duplicate] - python

I wonder why importing a variable in Python (3.4) gives a different result than importing the module and then referencing the variable. Moreover, why is a deep copy made, and is there a way to bypass the copy (other than defining a function that simply returns the value)?
a.py
v = 1
def set():
    global v
    v = 3
main.py
import a
import b
a.set()
b.foo()
b.py
from a import v
import a  # needed so that a.v can be referenced below
def foo():
    print(v)
    print(a.v)
    print(id(v))
    print(id(a.v))
Output
1
3
1585041872
1585041904

The problem is that you're rebinding a name to a new immutable value, not modifying an object in place. This is not specific to modules; the same thing happens when you pass a variable into a function and assign to the parameter there.
The value 1 is imported from a, period. Whatever a does afterwards to its own name v will not change what b's v refers to, because integers are immutable and a.set() simply points a.v at a different object.
If a.v were a mutable object (a list, say) and you mutated it in place, the change would be visible through every variable holding a reference to it.
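A minimal sketch of that distinction, using two hypothetical helpers (rebind and mutate are made up for illustration, not taken from the question): assigning to a parameter only rebinds the local name, while an in-place method call changes the shared object.

```python
def rebind(x):
    # Assignment makes the local name x refer to a new object;
    # the caller's variable is not affected.
    x = 3


def mutate(items):
    # append() changes the list object itself, so every name
    # bound to that list sees the change.
    items.append(3)


number = 1
rebind(number)

values = [1]
mutate(values)

print(number)  # 1
print(values)  # [1, 3]
```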

I asked a duplicate question myself, and with the help of others I figured out what is going on. Here is what I found, with quotes from the Python documentation:
from a import v does not create an alias for the name a.v. Instead, it adds a new variable v to b (b.v), bound to the object that a.v referred to when the import happened. Rebinding a.v later does not change b.v.
Python 2
The from form does not bind the module name: it goes through the list of identifiers, looks each one of them up in the module found in step (1), and binds the name in the local namespace to the object thus found.
Python 3
The from form uses a slightly more complex process:
find the module specified in the from clause, loading and initializing it if necessary;
for each of the identifiers specified in the import clauses:
check if the imported module has an attribute by that name
if not, attempt to import a submodule with that name and then check the imported module again for that attribute
if the attribute is not found, ImportError is raised.
otherwise, a reference to that value is stored in the local namespace, using the name in the as clause if it is present, otherwise using the attribute name
The keyword here is in the local namespace.

Let's examine the sequence of events:
a.v = 1 # a.py: v = 1
b.v = a.v # b.py: from a import v
a.v = 3 # a.set()
print(b.v) # foo(): print(v)
print(a.v) # foo(): print(a.v)
As you can see, from a import v binds b.v to the object that a.v referred to at that moment, and later rebinding a.v doesn't affect b.v.

When you say import a, you are creating a reference to the module; a.v is not copied, so a change made in one module is seen in all modules. When you say from a import v, you are creating a separate name v in your own module, bound to whatever a.v referred to at the time of the import. If either name is later rebound, the change is not reflected in the other.
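To make the two behaviours easy to try in a single file, here is a rough sketch that builds a stand-in for a.py in memory. The module name a_demo and the function name set_v are assumptions made for this illustration; the question's real files work the same way.

```python
import sys
import types

# Build a throwaway module in memory instead of a file on disk.
# "a_demo" and "set_v" are hypothetical names used only for this sketch.
a_demo = types.ModuleType("a_demo")
exec(
    "v = 1\n"
    "def set_v():\n"
    "    global v\n"
    "    v = 3\n",
    a_demo.__dict__,
)
sys.modules["a_demo"] = a_demo

from a_demo import v   # binds the local name v to the object 1
import a_demo as a     # binds the local name a to the module object

a.set_v()              # rebinds the name v inside a_demo to 3

print(v)    # 1
print(a.v)  # 3
```

No object is copied at any point: the from-import only creates a second name, and set_v() later points the module's own name at a different object.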

Related

how does import in python work? When import does python run the code in it? [duplicate]

I have two specific situations where I don't understand how importing works in Python:
1st specific situation:
When I import the same module in two different Python scripts, the module isn't imported twice, right? The first time Python encounters it, it is imported, and second time, does it check if the module has been imported, or does it make a copy?
2nd specific situation:
Consider the following module, called bla.py:
a = 10
And then, we have foo.py, a module which imports bla.py:
from bla import *
def Stuff ():
    return a
And after that, we have a script called bar.py, which gets executed by the user:
from foo import *
Stuff() #This should return 10
a = 5
Stuff()
Here I don't know: Does Stuff() return 10 or 5?
Part 1
The module is only loaded once, so there is no performance loss by importing it again. If you actually wanted it to be loaded/parsed again, you'd have to reload() the module (importlib.reload() in Python 3).
The first place checked is sys.modules, the cache of all modules that have been imported previously.
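A quick way to see the cache at work, sketched with the standard library's json module (any module would do):

```python
import sys
import json
import json as json_again  # second import: served from the sys.modules cache

same_object = json is json_again
cached = sys.modules["json"] is json

print(same_object, cached)  # True True
```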
Part 2
from foo import * imports a to the local scope. When assigning a value to a, it is replaced with the new value - but the original foo.a variable is not touched.
So unless you import foo and modify foo.a, both calls will return the same value.
For a mutable type such as a list or dict it would be different, modifying it would indeed affect the original variable - but assigning a new value to it would still not modify foo.whatever.
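Here is a rough single-file sketch of the situation from the question. The module foo_demo is a hypothetical in-memory stand-in for foo.py, and the names are imported explicitly rather than with * so that the snippet can be pasted anywhere.

```python
import sys
import types

# Stand-in for foo.py, built in memory; "foo_demo" is a hypothetical name.
foo_demo = types.ModuleType("foo_demo")
exec(
    "a = 10\n"
    "def Stuff():\n"
    "    return a\n",
    foo_demo.__dict__,
)
sys.modules["foo_demo"] = foo_demo

from foo_demo import a, Stuff  # explicit names instead of *

first = Stuff()    # 10
a = 5              # rebinds the local a only
second = Stuff()   # still 10: Stuff reads the a inside foo_demo

foo_demo.a = 5     # rebinds the name inside the module
third = Stuff()    # 5

print(first, second, third)  # 10 10 5
```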
If you want some more detailed information, have a look at http://docs.python.org/reference/executionmodel.html:
The following constructs bind names: formal parameters to functions, import statements, class and function definitions (these bind the class or function name in the defining block), and targets that are identifiers if occurring in an assignment, for loop header, in the second position of an except clause header or after as in a with statement.
The two relevant parts of that list are "import statements" and "targets that are identifiers if occurring in an assignment": First the name a is bound to the value of foo.a during the import. Then, when doing a = 5, the name a is bound to 5. Since modifying a list/dict does not cause any binding, those operations would modify the original one (if b were such a list, b and foo.b would be bound to the same object on which you operate). Assigning a new object to b would be a binding operation again and thus separate b from foo.b.
It is also worth noting what exactly the import statement does:
import foo binds the module name to the module object in the current scope, so if you modify foo.whatever, you will work with the name in that module - any modifications/assignments will affect the variable in the module.
from foo import bar binds the given name(s) only (i.e. foo will remain unbound) to the element with the same name in foo - so operations on bar behave like explained earlier.
from foo import * behaves like the previous one, but it imports all global names which are not prefixed with an underscore. If the module defines __all__ only names inside this sequence are imported.
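The filtering rules for the star form can be checked with a small sketch. The module star_demo and the helper star_import_names are hypothetical; the star import is run through exec in a fresh namespace so that the names it binds can be listed.

```python
import sys
import types

# In-memory stand-in for a module with one public and one underscore name.
star_demo = types.ModuleType("star_demo")
exec("public = 1\n_private = 2\n", star_demo.__dict__)
sys.modules["star_demo"] = star_demo


def star_import_names():
    # Run the star import in a fresh namespace and list what it bound.
    namespace = {}
    exec("from star_demo import *", namespace)
    return sorted(name for name in namespace if name != "__builtins__")


without_all = star_import_names()

# Once __all__ exists, it alone decides what the star import binds.
star_demo.__all__ = ["_private"]
with_all = star_import_names()

print(without_all)  # ['public']
print(with_all)     # ['_private']
```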
Part 3 (which doesn't even exist in your question :p)
The Python documentation is extremely good and usually verbose; you will find an answer to almost every language-related question in there. Here are some useful links:
http://docs.python.org/reference/datamodel.html (classes, properties, magic methods, etc.)
http://docs.python.org/reference/executionmodel.html (how variables work in python)
http://docs.python.org/reference/expressions.html
http://docs.python.org/reference/simple_stmts.html (statements such as import, yield)
http://docs.python.org/reference/compound_stmts.html (block statements such as for, try, with)
To answer your first question:
No, a module does not get imported twice. When Python loads a module, it first checks for the module in sys.modules. If it is not in there, it is put in there and loaded.
To answer your second question:
Modules can define what names they will export to a from camelot import * scenario, and the behavior is to create new names for the existing values, not to alias the existing variables (Python has no references to variables, only names bound to objects).
On a somewhat related topic, doing a from camelot import * is not the same as a regular import.

Does mutability change behaviour of namespace in case of 'from modulename import variable'?

If we update the class variable via instance, then a new instance variable gets created for that instance only. However if the class variable is mutable (say list), then making change to class variable via instance using append makes the change global across all instances.
Similarly, in the case of from modulename import x, we know that if x is an int/string then changing the value of x in the calling module is only visible in the calling module and not globally. However, if x is mutable, does mutability have any impact on the behaviour with respect to namespaces? For example, does changing the value of x in the calling module update the value in the global namespace? Or something else?
Python does not make a copy of an object on import. The from module import foo syntax simply assigns a specific object from the module to a name in your scope. Furthermore, a module is only loaded once, so any subsequent import would recover the same object as well.
Thus, if you import a mutable object and update it, this change is reflected everywhere that object has been imported, however that import took place. Keep in mind that this happens because the imported name and the module attribute refer to the same object in memory.
Example
Here is an example demonstrating that behaviour.
module.py
lst = []
def print_lst():
    print(lst)
main.py
from module import lst
from module import print_lst
# Print the list initially
print_lst()
# Append to the list in this scope
lst.append(1)
# Print the list from inside the module
print_lst()
# Importing the module again does not reload it
import module
# Proof that the list was not copied
print(module.lst is lst)
Output
[]
[1]
True
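An old story. In short, when you import something, you get a copy of the reference (not of the object) in the current namespace. That's exactly the same as the following example:
a = [1, 2, 3]
b = a # b now refers to the same list as a
b.append(4) # mutates the shared list
print(a) # [1, 2, 3, 4]
a = 1
b = a # b now refers to the same int as a
b = 2 # rebinds b only; a is untouched
print(a) # 1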

python: state of imported global variables holding a reference to lambda function

Backstory:
I was trying to implement a way to handle -v parameters to increase the verbosity of an application. To that end, I wanted to use a global variable that initially points to an empty lambda function. If -v is given, the variable is reassigned to another lambda function that prints its input.
MWE:
I noticed that this did not work as expected when calling the lambda function from another module after importing it via from x import *...
mwe.py:
from mod import *
import mod
def f():
    vprint("test in f")
vprint("test before")
print("before: %d" % foo)
set_verbosity(1)
vprint("test after")
print("after: %d" % foo)
f()
mod.vprint("explicit: %d" % mod.foo)
modf()
mod.py:
vprint = lambda *a, **k: None
foo = 42
def set_verbosity(verbose):
    global vprint, foo
    if verbose > 0:
        vprint = lambda *args, **kwargs: print(*args, **kwargs)
        foo = 0
def modf():
    vprint("modf: %d" % foo)
The output is
before: 42
after: 42
explicit: 0
modf: 0
where the "explicit" and "modf" outputs are due to the mod.vprint and modf calls at the end of the mwe. All other invocations of vprint (that go through the imported version of vprint) are apparently not using the updated definition. Likewise, the value of foo seems to be imported only once.
Question:
It looks to me as if the from x import * type of imports copies the state of the globals of the imported module. I am not so much interested in workarounds per se but the actual reason for this behavior. Where is this defined in the documentation and what's the rationale?
Workaround:
As a side note, one way to implement this anyway is to wrap the global lambda variables by functions and export only them:
_vprint = lambda *a, **k: None
def vprint(*args, **kwargs):
    _vprint(*args, **kwargs)
def set_verbosity(verbose):
    global _vprint
    if verbose > 0:
        _vprint = lambda *args, **kwargs: print(*args, **kwargs)
This makes it work with the import-from way that allows the other module to simply call vprint instead of explicitly dereferencing via the module's name.
TL;DR: when you do from module import *, you're copying the names and their associated references; changing the reference associated with the original name does not change the reference associated with the copy.
This deals with the underlying difference between names and references. The underlying reason for the behavior has to do with how python handles such things.
The underlying model is that Python names and container slots hold references to objects, and assignment changes those references rather than the objects themselves. When you do my_list[2] = 5, you're not overwriting the object that was stored there; rather, you're pointing index 2 of my_list (its third element) at the object 5. The object that my_list[2] used to refer to is still there, but if nothing refers to it any more, its memory will be reclaimed.
The same principle goes with names. Any given namespace in python is comparable to a dict - each name has a corresponding reference. And this is where the problems come in.
Consider the difference between the following two statements:
from module import *
import module
In both cases, module is loaded into memory.
In the latter case, only one thing is added to the local namespace - the name 'module', which references the entire memory block containing the module that was just loaded. Or, well, it references the memory block of that module's own namespace, which itself has references to all the names in the module, and so on all the way down.
In the former case, however, every name in module's namespace is copied into the local namespace. The same block of memory still exists, but instead of having one reference to all of it, you now have many references to small parts of it.
Now, let's say we do both of those statements in succession:
from module import *
import module
This leaves us with one name 'module' referencing all the memory the module was loaded into, and a bunch of other names that reference individual parts of that block. We can verify that they point to the same thing:
print(module.func_name == func_name)
# True
But now, we try to assign something else to module.func_name:
module.func_name = lambda x: None
print(module.func_name == func_name)
# False
It's no longer the same. Why?
Well, when we did module.func_name = lambda x: None, we first created a new function object for lambda x: None, and then we changed module's 'func_name' name to reference that object instead of what it was referencing. Note that, like the example I gave earlier with lists, we didn't change the thing that module.func_name was previously referencing - it still exists, and the local func_name continues to reference it.
So when you do from module import *, you're copying the names and their associated references; changing the reference associated with the original name does not change the reference associated with the copy.
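Since a namespace behaves much like a dict, the whole effect can be sketched with one. Everything here is an illustration under assumptions: mod_demo is a hypothetical in-memory module, and the plain dict local_namespace stands in for the importing module's namespace.

```python
import types

# In-memory stand-in for a module; "mod_demo" is a hypothetical name.
mod_demo = types.ModuleType("mod_demo")
mod_demo.func_name = len

# Roughly what a from-import does: copy the reference under the same name.
local_namespace = {"func_name": vars(mod_demo)["func_name"]}

same_before = local_namespace["func_name"] is mod_demo.func_name  # True

# Rebind the name inside the module; the copied reference is untouched.
mod_demo.func_name = max

same_after = local_namespace["func_name"] is mod_demo.func_name   # False
still_len = local_namespace["func_name"] is len                   # True

print(same_before, same_after, still_len)  # True False True
```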
The workaround for this is to not do import *. In fact, this is pretty much the ultimate reason why using import * is usually considered poor practice, save for a handful of special cases. Consider the following code:
# module.py
variable = "Original"
# file1.py
import module
def func1():
    module.variable = "New"
# file2.py
import module
import file1
print(module.variable)
file1.func1()
print(module.variable)
When you run python file2.py, you get the following output:
Original
New
Why? Because file1 and file2 both imported module, and in both of their namespaces 'module' is pointing to the same block of memory. module's namespace contains a name 'variable' referencing some value. Then, the following things happen:
file2 says "okay, module, please give me the value associated with the name 'variable' in your namespace."
file1.func1() says "okay, module, the name 'variable' in your namespace now references this other value."
file2 says "okay, module, please give me the value associated with the name 'variable' in your namespace."
Since file1 and file2 are still both talking to the same bit of memory, they stay coordinated.
Random stab in the dark ahead:
If you look at the docs for __import__, there's the bit:
On the other hand, the statement from spam.ham import eggs, sausage as saus results in
_temp = __import__('spam.ham', globals(), locals(), ['eggs', 'sausage'], 0)
eggs = _temp.eggs
saus = _temp.sausage
I think this is the key. If we infer that from mod import * results in something like:
_temp = __import__('mod', globals(), locals(), ['*'], 0)
vprint = _temp.vprint
foo = _temp.foo
This shows what the problem is. vprint is a reference to the old version of vprint, i.e. what mod.vprint was pointing to at the time of import. Reassigning what the vprint in mod is pointing to doesn't affect anything in mwe, because the mwe reference to vprint is still looking at the previous lambda.
It's similar to how this doesn't change b:
a = 1
b = a
a = 2
b is still pointing to 1, because reassigning a doesn't affect what b is looking at.
On the other hand, mod.vprint does work because it looks the name up in mod's namespace at the time of the call, instead of going through a separate name that was bound once at import time.
This was a random stab because I think I know the answer based on some random reading I did awhile ago. If I'm incorrect, please let me know and I'll remove this to avoid confusion.
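The documented expansion can be tried directly. This sketch uses os.path from the standard library rather than the question's mod, and the names ending in _by_hand are made up for the comparison.

```python
from os.path import join, sep as separator

# The expansion described in the __import__ documentation,
# written out by hand for the same names.
_temp = __import__("os.path", globals(), locals(), ["join", "sep"], 0)
join_by_hand = _temp.join
separator_by_hand = _temp.sep

print(join_by_hand is join)            # True
print(separator_by_hand == separator)  # True
```

Both routes end in plain assignments of the form name = module.attribute, which is why a later rebinding inside the module is not seen through the imported name.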

Order of evaluation of classes?

In a file lib.py I defined a functional class C and an enumeration class E as follows:
class C:
    a = None
    def meth(self, v):
        if v == E.v1:
            print("In C.meth().v1")
            a = E.v1
        if v == E.v2:
            print("In C.meth().v2")
            a = E.v2
from enum import Enum
class E(Enum):
    print("In Enum")
    v1 = 1
    v2 = 2
Then, I import the two classes into my module main.py and use the enumeration:
from lib import C
from lib import E
c = C()
c.meth(E.v1)
When running, I get the following output:
In Enum
In C.meth().v1
Now, since Python is an interpreted language (at least, when using IDLE), I'd expect to get an error on the reference to the enumerations in the method meth. Since there is no error, and it seems to run OK, I wonder what are the (ordering) rules for referencing classes in the same module, and in between different modules? Why is there no error?
Name lookup happens at run time. So when you are defining class C and its method meth, then the lookup on E isn’t done yet. So it’s not a problem that you define it afterwards. Instead, the lookup happens when you call the method.
Also, name lookup happens by going up the scope, so meth will find the original E declared on module level, regardless of whether you import it in your main.py or not. Since you also import E in main.py, which is a reference to the same object, you can reference the same enum value in there too.
See also this example:
>>> def test(): # foo is not defined at this time
        print(foo)
>>> test()
NameError: global name 'foo' is not defined
>>> foo = 'bar' # after defining foo, it works:
>>> test()
bar
When defining methods, variables are never “embedded”; the methods only contain the names and those names are looked up at run-time. However, due to how Python does the lookup, names of local variables are always “around” even if they haven’t been initialized yet. This can result in UnboundLocalErrors:
>>> def test():
        print(foo)
        foo = 'baz'
>>> test()
UnboundLocalError: local variable 'foo' referenced before assignment
One might expect that foo would be looked up in the outer scope for the first print, but because there is a local foo (even if it wasn’t initialized yet), foo will always* resolve to the local foo.
(* The global statement (or nonlocal, for a name in an enclosing function) lets you declare foo as not local, resolving it to the outer scope, again for all uses of foo in that function.)
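As a small sketch of the footnote: for a module-level name the statement that does this is global (nonlocal applies to names in an enclosing function). The names counter, broken and fixed are hypothetical.

```python
counter = 0


def broken():
    # The assignment below makes counter local for the whole function,
    # so this read fails before the assignment runs.
    print(counter)
    counter = 1


def fixed():
    global counter  # every use of counter here is the module-level name
    print(counter)
    counter = 1


try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

fixed()  # prints 0, then rebinds the module-level counter
print(error_name, counter)  # UnboundLocalError 1
```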
When a module is imported, its statements are executed from top to bottom. Inside a class definition, the statements are also executed, which defines the methods of the class. A def defines a function or method, but the statements inside the def are only compiled at that point, not executed.
The simplest way to understand the order of evaluation in your code is to watch it execute:
http://dbgr.cc/q
Press the play button on the far right of the debug buttons and it will automatically step through.
I think what is confusing to you is that when class E is defined, all statements inside of the E class are run. This is the case for every class definition. This includes calling the print function to say "In Enum", as well as defining the v1 and v2 members of the E class.
The line c.meth(E.v1) isn't executed until both the C and the E classes have been defined, which means that E.v1 has also already been defined. This is why there is no error like you were expecting.
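The order can also be recorded without a debugger. In this sketch the events list is a hypothetical helper, and E is a plain class rather than an Enum to keep the example short; the ordering argument is the same.

```python
events = []


class C:
    events.append("C body")  # runs while the class is being defined

    def meth(self):
        # Runs only when called; E is looked up at that moment.
        events.append("meth called")
        return E.v1


class E:
    events.append("E body")
    v1 = 1


events.append("calling")
result = C().meth()

print(events)  # ['C body', 'E body', 'calling', 'meth called']
print(result)  # 1
```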

How to change a module variable from another module?

Suppose I have a package named bar, and it contains bar.py:
a = None
def foobar():
    print a
and __init__.py:
from bar import a, foobar
Then I execute this script:
import bar
print bar.a
bar.a = 1
print bar.a
bar.foobar()
Here's what I expect:
None
1
1
Here's what I get:
None
1
None
Can anyone explain my misconception?
You are using from bar import a. a becomes a symbol in the global scope of the importing module (or whatever scope the import statement occurs in).
When you assign a new value to a, you are just changing which value a points to, not the actual value. Try to import bar.py directly with import bar in __init__.py and conduct your experiment there by setting bar.a = 1. This way, you will actually be modifying bar.__dict__['a'] which is the 'real' value of a in this context.
It's a little convoluted with three layers but bar.a = 1 changes the value of a in the module called bar that is actually derived from __init__.py. It does not change the value of a that foobar sees because foobar lives in the actual file bar.py. You could set bar.bar.a if you wanted to change that.
This is one of the dangers of using the from foo import bar form of the import statement: it splits bar into two symbols, one visible globally from within foo, which starts off pointing to the original value, and a different symbol visible in the scope where the import statement is executed. Changing where a symbol points doesn't change the value that it pointed to.
This sort of stuff is a killer when trying to reload a module from the interactive interpreter.
One source of difficulty with this question is that you have a program named bar/bar.py: import bar imports either bar/__init__.py or bar/bar.py, depending on where it is done, which makes it a little cumbersome to track which a is bar.a.
Here is how it works:
The key to understanding what happens is to realize that in your __init__.py,
from bar import a
in effect does something like
a = bar.a
# … where bar = bar/bar.py (as if bar were imported locally from __init__.py)
and defines a new variable (bar/__init__.py:a, if you wish). Thus, your from bar import a in __init__.py binds name bar/__init__.py:a to the original bar.py:a object (None). This is why you can do from bar import a as a2 in __init__.py: in this case, it is clear that you have both bar/bar.py:a and a distinct variable name bar/__init__.py:a2 (in your case, the names of the two variables just happen to both be a, but they still live in different namespaces: in __init__.py, they are bar.a and a).
Now, when you do
import bar
print bar.a
you are accessing variable bar/__init__.py:a (since import bar imports your bar/__init__.py). This is the variable you modify (to 1). You are not touching the contents of variable bar/bar.py:a. So when you subsequently do
bar.foobar()
you call bar/bar.py:foobar(), which accesses variable a from bar/bar.py, which is still None (a function always looks up global names in the namespace of the module where it was defined, so the a in foobar is bar.py:a, not any other a variable defined in another module, as there might be many a variables in all the imported modules). Hence the last None output.
Conclusion: it is best to avoid any ambiguity in import bar, by not having any bar/bar.py module (since bar/__init__.py makes directory bar/ a package already, that you can also import with import bar).
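The two a variables can be reproduced in one file. This is only a sketch under assumptions: inner_demo stands in for bar/bar.py and facade_demo for bar/__init__.py, both built in memory, and foobar returns a instead of printing it so the result can be inspected.

```python
import sys
import types

# Stand-in for bar/bar.py ("inner_demo" is a hypothetical name).
inner = types.ModuleType("inner_demo")
exec(
    "a = None\n"
    "def foobar():\n"
    "    return a\n",
    inner.__dict__,
)
sys.modules["inner_demo"] = inner

# Stand-in for bar/__init__.py: it re-exports a and foobar.
facade = types.ModuleType("facade_demo")
exec("from inner_demo import a, foobar", facade.__dict__)

facade.a = 1                        # rebinds the facade's own a
seen_after_facade_update = facade.foobar()

inner.a = 1                         # rebinds the a that foobar reads
seen_after_inner_update = facade.foobar()

print(seen_after_facade_update)  # None
print(seen_after_inner_update)   # 1
```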
To put another way:
Turns out this misconception is very easy to make.
It is subtly defined in the Python language reference by its use of object instead of symbol. I would suggest that the Python language reference make this clearer and less terse.
The from form does not bind the module name: it goes through the
list of identifiers, looks each one of them up in the module found in
step (1), and binds the name in the local namespace to the object thus
found.
HOWEVER:
When you import, you take the current value of the imported symbol and add it to your namespace under the given name. You are not importing a reference to the symbol; you are effectively importing its value.
Thus, to get the updated value of a symbol, you must import something that holds a reference to that symbol (the module itself, for instance).
In other words, importing is NOT like an import in Java, an external declaration in C/C++ or even a use clause in Perl.
Rather, the following statement in Python:
from some_other_module import a as x
is more like the following code in K&R C:
extern int a; /* import from the EXTERN file */
int x = a;
(caveat: in the Python case, "a" and "x" are essentially a reference to the actual value: you're not copying the INT, you're copying the reference address)
