I'm passing the value i to th = threading.Thread(target=func, args=(i,)) and starting the thread immediately with th.start().
Because this is executed inside a loop with a changing index i, I'm wondering whether i inside the thread retains the value it had when the thread was created, or whether the thread works on a reference to i. In the latter case the value wouldn't necessarily be the same as it was when th was created.
Are values passed by reference or value?
I would say, passing mutable objects to functions is calling by reference.
You are not alone in wanting to say that, but thinking that way limits your ability to communicate with others about how Python programs actually work.
Consider the following Python fragment:
a = [1, 2, 3]
b = a
foobar(a)
if b is not a:
    print('Impossible!')
If Python were a "pass-by-reference" programming language, then the foobar(a) call could cause this program to print Impossible! by changing the local variable a to refer to some different object.
But, Python does not do pass-by-reference. It only does pass-by-value, and that means there is no definition of foobar that could make the fragment execute the print call. The foobar function can mutate the object to which local variable a refers, but it has no ability to modify a itself.
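To make this concrete, here is a minimal sketch with one hypothetical body for foobar (the fragment above leaves it undefined). It both mutates its argument and rebinds its parameter; only the mutation is visible to the caller.

```python
def foobar(lst):
    # Hypothetical body, chosen only for illustration.
    lst.append(4)      # mutates the object the caller also refers to
    lst = [9, 9, 9]    # rebinds the local name only; the caller's a is untouched

a = [1, 2, 3]
b = a
foobar(a)

print(b is a)  # True
print(a)       # [1, 2, 3, 4]
```

Whatever foobar does internally, b is a still holds afterwards, so the print call in the fragment can never run.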
See https://en.wikipedia.org/wiki/Evaluation_strategy
Pass by value and pass by reference are two terms that can be misleading, and they don't always mean the same thing in every language. I'm going to assume we're talking about what the two terms mean in C (where passing by reference means passing a pointer to the variable).
Python is really neither of those. I'll give you an example I took from an article (all credit to the original writer; I'll link the article at the end):
def spam(eggs):
    eggs.append(1)
    eggs = [2, 3]
ham = [0]
spam(ham)
print(ham)
When spam is called, both ham and eggs point to the same value ([0]), that is, to the same object. So, when eggs.append(1) is executed, [0] becomes [0, 1]. That sounds like pass by reference.
However, under pass by reference, the assignment eggs = [2, 3] would make both eggs and ham refer to the new list. That does not happen: eggs now points to a new list in memory containing [2, 3], but ham still points to the original list with the 1 appended to it. That bit sounds more like pass by value.
EDIT
As explained above, if a parameter of the thread is modified inside it, the changes will be seen in the original thread as long as the parameter is mutable. That way, passing a list to the thread and appending something to it will be reflected in the caller thread, for example.
However, an immutable object can't be modified. If you do i += 1, you don't modify the integer, because integers are immutable in Python. You are assigning to i a new integer with a value one higher than before. It's the same thing that happened with eggs = [2, 3]. So, in that particular example, changes will not be reflected in the original thread.
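Applied to the original threading question, a minimal sketch might look like the following. The names seen and lock and the body of func are assumptions made for illustration; only the threading.Thread(target=func, args=...) call comes from the question. The args tuple is built when the Thread object is created, so each thread receives the integer that i referred to at that moment, even if the thread starts later.

```python
import threading

seen = []                  # mutable list shared with every thread
lock = threading.Lock()    # guards appends to the shared list

def func(i, shared):
    with lock:
        shared.append(i)   # mutation: visible to the caller
    i += 100               # rebinding: only this thread's local name changes

threads = []
for i in range(5):
    # args=(i, seen) is built right here, so it holds the object
    # that i refers to in this iteration.
    th = threading.Thread(target=func, args=(i, seen))
    threads.append(th)

# Start the threads only after the loop has finished and i has moved on to 4.
for th in threads:
    th.start()
for th in threads:
    th.join()

print(sorted(seen))  # [0, 1, 2, 3, 4]
print(i)             # 4
```

Each thread still sees its own value of i, while the appends to the shared list are visible in the main thread.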
Hope this helps!
Here's the article I promised, it has a much better explanation of the matter. http://stupidpythonideas.blogspot.com/2013/11/does-python-pass-by-value-or-by.html?m=1
I understand that one should not use a mutable default parameter value in Python (with some exceptions) because this value is evaluated and stored only once when the function is defined, and not each time the function is later called.
My understanding of that is this (using the example below; please excuse my imprecise language as I'm just a beginner in Python programming, who's stuck at the Function chapter of my textbook because of this):
def f(x = [1, 2, 3]):
    x.append(4)
    print(x)
f()
f()
1) The function f is defined, and x (local variable in f) assumes the default value of [1, 2, 3] (even before the function is called)
2) When f() is called, x is still [1, 2, 3] due to no argument passed to it, and x continues having its default value
3) x is modified in place with append, becomes [1, 2, 3, 4], and is printed as such
However, this is where my confusion arises. I'd assume that:
4) When f ends, x is destroyed (in the stack or whatever you'd call it) and is no longer associated with the list object [1, 2, 3, 4]**
5) The list object [1, 2, 3, 4] is reclaimed since there's no variable that refers to it anymore
Therefore,
6) When f() is called the second time, I'd expect Python to output an error since x now no longer has a value associated with it. In other words, how can Python reuse the default value from the last evaluation when it's been reclaimed/destroyed?
Appreciate all your help and explanation!
** this understanding I got from Ned Batchelder's page on variable name assignment (see below)
While it may seem to you that x, the default value, is disposed of at the end of the execution, it is not.
In fact, Python has a global namespace with all the names available for you to use (built-in functions, classes and functions you import or define).
The content of this namespace is made of objects. Functions are objects too.
As a test, if you try this in a script or in the python command line, you will see what I mean:
def f(x = [1, 2, 3]):
    x.append(4)
    print(x)
print(dir(f))
you will see the object nature of the function f. Because the function is an object, its default values are referenced in an attribute, f.func_defaults; therefore they are always available and, if mutable, they retain any changes, giving you side effects which you may not want.
EDIT: in python 3 the attribute has been replaced by f.__defaults__
There are two references to the list in your case. One is stored in the background of the function as the default value of the argument x.
When the function is called without x, a new reference to the same list is created as the local variable x. Then you append to the list via the second reference. After the call, the second reference goes away together with the local variable x. The first reference still points to the same list, which now has one more element.
Or in short: there is only one list all the time.
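A short sketch of this in Python 3, where the defaults live in f.__defaults__. The function g and the names first and second are additions for illustration; g shows the commonly used None-sentinel idiom for getting a fresh list on each call.

```python
def f(x=[1, 2, 3]):
    x.append(4)
    return x

first = f()
second = f()

print(first is second)              # True: both calls returned the one default list
print(f.__defaults__[0] is first)   # True: the function object keeps it alive
print(f.__defaults__)               # ([1, 2, 3, 4, 4],)

def g(x=None):
    # Build a new list on every call instead of sharing one default object.
    if x is None:
        x = [1, 2, 3]
    x.append(4)
    return x

print(g())  # [1, 2, 3, 4]
print(g())  # [1, 2, 3, 4]
```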
I am new to Python from R. I have recently spent a lot of time reading up on how everything in Python is an object, objects can call methods on themselves, methods are functions within a class, yada yada yada.
Here's what I don't understand. Take the following simple code:
mylist = [3, 1, 7]
If I want to know how many times the number 7 occurs, I can do:
mylist.count(7)
That, of course, returns 1. And if I want to save the count number to another variable:
seven_counts = mylist.count(7)
So far, so good. Other than the syntax, the behavior is similar to R. However, let's say I am thinking about adding a number to my list:
mylist.append(9)
Wait a minute, that method actually changed the variable itself! (i.e., "mylist" has been altered and now includes the number 9 as the fourth element in the list.) Assigning the result to a new variable (like I did with seven_counts) produces garbage:
newlist = mylist.append(9)
I find the inconsistency in this behavior a bit odd, and frankly undesirable. (Let's say I wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable.)
My question is simple:
Is there a way to know in advance if calling a particular method will actually alter your variable (object)?
Aside from reading the documentation (which for some methods will include type annotations specifying the return value) or playing with the method in the interactive interpreter (including using help() to check the docstring for a type annotation), no, you can't know up front just by looking at the method.
That said, the behavior you're seeing is intentional. Python methods either return a new modified copy of the object or modify the object in place; at least among built-ins, they never do both (some methods mutate the object and return a non-None value, but it's never the object just mutated; the pop method of dict and list is an example of this case).
This either/or behavior is intentional; if they didn't obey this rule, you'd have had an even more confusing and hard to identify problem, namely, determining whether append mutated the value it was called on, or returned a new object. You definitely got back a list, but is it a new list or the same list? If it mutated the value it was called on, then
newlist = mylist.append(9)
is a little strange; newlist and mylist would be aliases to the same list (so why have both names?). You might not even notice for a while; you'd continue using newlist, thinking it was independent of mylist, only to look at mylist and discover it was all messed up. By having all such "modify in place" methods return None (or at least, not the original object), the error is discovered more quickly/easily; if you try and use newlist, mistakenly believing it to be a list, you'll immediately get TypeErrors or AttributeErrors.
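A small sketch of this convention using built-in list operations; the variable names are chosen only for illustration.

```python
nums = [3, 1, 7]

result = nums.sort()          # sorts the list in place
print(result)                 # None
print(nums)                   # [1, 3, 7]

ordered = sorted([3, 1, 7])   # builds and returns a new list instead
print(ordered)                # [1, 3, 7]

last = nums.pop()             # mutates the list and returns the removed element,
print(last, nums)             # not the list itself: 7 [1, 3]
```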
Basically, the only way to know in advance is to read the documentation. For methods whose name indicates a modifying operation, you can check the return value and often get an idea as to whether they're mutating. It helps to know what types are mutable in the first place; list, dict, set and bytearray are all mutable, and the methods they have that their immutable counterparts (aside from dict, which has no immutable counterpart) lack tend to mutate the object in place.
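One rough way to apply that last observation is to list the public methods a mutable type has that its immutable counterpart lacks. This is only a heuristic sketch, not a reliable test of whether a method mutates.

```python
# Public names defined on list but not on tuple.
list_only = sorted(set(dir(list)) - set(dir(tuple)))
public = [name for name in list_only if not name.startswith("_")]
print(public)
```

In Python 3 the result includes append, extend, insert, pop, remove, reverse and sort, all of which modify the list in place. It also includes copy, which does not mutate, so the rule of thumb is "tend to", not "always".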
The default tends to be to mutate the object in place simply because that's more efficient; if you have a 100,000 element list, a default behavior for append that made a new 100,001 element list and returned it would be extremely inefficient (and there would be no obvious way to avoid it). For immutable types (e.g. str, tuple, frozenset) this is unavoidable, and you can use those types if you want a guarantee that the object is never mutated in place, but it comes at the cost of unnecessary creation and destruction of objects, which will slow down your code in most cases.
Just check out the docstring:
>>> list.count.__doc__
'L.count(value) -> integer -- return number of occurrences of value'
>>> list.append.__doc__
'L.append(object) -> None -- append object to end'
There isn't really an easy way to tell, but:
immutable object --> no way of changing through method calls
So, for example, tuple has no methods which affect the tuple as it is unchangeable so methods can only return new instances.
And if you "wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable" then you can concatenate the list with a new list with one element.
i.e.
>>> l = [1,2,3]
>>> k = l + [4]
>>> l
[1, 2, 3]
>>> k
[1, 2, 3, 4]
Not merely from the invocation (your method call). You can guarantee that a method won't change the object if you only work with immutable objects; methods that are defined to change the object will either not exist on the immutable type you use or will fail when executed.
In real life, you look at the method's documentation: that will tell you exactly what happens.
[I was about to include what Joe Iddon's answer covers ...]
ie we have the global declaration, but no local.
"Normally" arguments are local, I think, or they certainly behave that way.
However if an argument is, say, a list and a method is applied which modifies the list, some surprising (to me) results can ensue.
I have 2 questions: what is the proper way to ensure that a variable is truly local?
I wound up using the following, which works, but it can hardly be the proper way of doing it:
def AexclB(a,b):
    z = a+[] # yuk
    for k in range(0, len(b)):
        try: z.remove(b[k])
        except: continue
    return z
Absent the +[], "a" in the calling scope gets modified, which is not desired.
(The issue here is using a list method,
The supplementary question is, why is there no "local" declaration?
Finally, in trying to pin this down, I made various mickey mouse functions which all behaved as expected except the last one:
def fun4(a):
    z = a
    z = z.append(["!!"])
    return z
a = ["hello"]
print "a=",a
print "fun4(a)=",fun4(a)
print "a=",a
which produced the following on the console:
a= ['hello']
fun4(a)= None
a= ['hello', ['!!']]
...
>>>
The 'None' result was not expected (by me).
Python 2.7 btw in case that matters.
PS: I've tried searching here and elsewhere but not succeeded in finding anything corresponding exactly - there's lots about making variables global, sadly.
It's not that z isn't a local variable in your function. Rather when you have the line z = a, you are making z refer to the same list in memory that a already points to. If you want z to be a copy of a, then you should write z = a[:] or z = list(a).
See this link for some illustrations and a bit more explanation http://henry.precheur.org/python/copy_list
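As a sketch, the function from the question could be written with an explicit copy like this. The name a_excl_b and the narrower except ValueError are my changes, not the asker's code; the removal logic is otherwise the same.

```python
def a_excl_b(a, b):
    z = list(a)               # shallow copy; the caller's list is left alone
    for item in b:
        try:
            z.remove(item)    # removes the first matching element
        except ValueError:    # item is not present in z
            continue
    return z

a = [1, 2, 3, 2]
b = [2, 5]
print(a_excl_b(a, b))  # [1, 3, 2]
print(a)               # [1, 2, 3, 2]
```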
Python will not copy objects unless you explicitly ask it to. Integers and strings are not modifiable, so every operation on them returns a new instance of the type. Lists, dictionaries, sets, and most user-defined objects are mutable, so operations like list.append happen in-place (and by convention return None).
If you want the variable to be a copy, you must explicitly copy it. In the case of lists, you slice them:
z = a[:]
There is a great answer that will cover most of your question here, which explains mutable and immutable types, how they are kept in memory, and how they are referenced. The first section of the answer is for you (before the "How do we get around this?" header).
In the following line
z = z.append(["!!"])
Lists are mutable objects, so when you call append, it updates the referenced object; it does not create a new one and return it. If a method or function does not return anything, it returns None.
The link above also gives an immutable example so you can see the real difference.
You cannot make a mutable object act as if it were immutable. But instead of passing a reference to the original, you can create a new object copied from the existing mutable one.
a = [1,2,3]
b = a[:]
For more options you can check here
What you're missing is that all variable assignment in python is by reference (or by pointer, if you like). Passing arguments to a function literally assigns values from the caller to the arguments of the function, by reference. If you dig into the reference, and change something inside it, the caller will see that change.
If you want to ensure that callers will not have their values changed, you can either try to use immutable values more often (tuple, frozenset, str, int, bool, NoneType), or be certain to take copies of your data before mutating it in place.
In summary, scoping isn't involved in your problem here. Mutability is.
Is that clear now?
Still not sure what's the 'correct' way to force the copy, there are various suggestions here.
It differs by data type, but generally <type>(obj) will do the trick. For example list([1, 2]) and dict({1:2}) both return (shallow!) copies of their argument.
If, however, you have a tree of mutable objects and also you don't know a-priori which level of the tree you might modify, you need the copy module. That said, I've only needed this a handful of times (in 8 years of full-time python), and most of those ended up causing bugs. If you need this, it's a code smell, in my opinion.
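A minimal sketch of the difference between a shallow copy and copy.deepcopy, using a made-up nested dictionary (tree and its keys are hypothetical):

```python
import copy

tree = {"leaf": [1, 2], "name": "root"}

shallow = dict(tree)          # new outer dict, but the inner list is shared
deep = copy.deepcopy(tree)    # new outer dict and new inner list

shallow["leaf"].append(3)     # mutates the list that tree also refers to
shallow["name"] = "other"     # rebinds a key in the copy only

print(tree["leaf"])   # [1, 2, 3]
print(tree["name"])   # root
print(deep["leaf"])   # [1, 2]
```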
The complexity of maintaining copies of mutable objects is the reason why there is a growing trend of using immutable objects by default. In the Clojure language, all data types are immutable by default and mutability is treated as a special case to be minimized.
If you need to work on a list or other object in a truly local context you need to explicitly make a copy or a deep copy of it.
from copy import copy
def fn(x):
    y = copy(x)  # shallow copy: top-level changes to y do not affect the caller's x
There is a lot of confusion about Python names on the web, and the documentation doesn't seem to be that clear about them. Below are several things I have read about Python names.
names are references to objects (where are they? heap?) and what name holds is an address. (like Java).
names in python are like C++ references (int& b), which means that a name is another alias for a memory location; i.e. for int a, a is a memory location, and int& b = a means that b is another name for the same memory location.
names are very similar to automatically dereferenced pointers variables in C.
Which of the above statements is/are correct?
Do Python names contain some kind of address in them or is it just a name to a memory location (like C++ & references)?
Where are python names stored, Stack or heap?
EDIT:
Check out the below lines from http://etutorials.org/Programming/Python.+Text+processing/Appendix+A.+A+Selective+and+Impressionistic+Short+Review+of+Python/A.2+Namespaces+and+Bindings/#
Whenever a (possibly qualified) name occurs on the right side of an assignment, or on a line by itself, the name is dereferenced to the object itself. If a name has not been bound inside some accessible scope, it cannot be dereferenced; attempting to do so raises a NameError exception. If the name is followed by left and right parentheses (possibly with comma-separated expressions between them), the object is invoked/called after it is dereferenced. Exactly what happens upon invocation can be controlled and overridden for Python objects; but in general, invoking a function or method runs some code, and invoking a class creates an instance. For example:
pkg.subpkg.func() # invoke a function from a namespace
x = y # deref 'y' and bind same object to 'x'
This makes sense. I just want to cross-check how true it is. Comments and answers, please.
names are references to objects
Yes. You shouldn't care where the objects live if you just want to understand Python variables' semantics; they're somewhere in memory and Python implementations manage memory for you. How they do that depends on the implementation (CPython, Jython, PyPy...).
names in python are like C++ references
Not exactly. Assigning to a reference in C++ actually assigns to the memory location it refers to, e.g. after
int i = 0;
int &r = i;
r = 1;
it is true that i == 1. You can't do this in Python except by using a mutable container object. The closest you can get to the C++ reference behavior is
i = [0] # single-element list
r = i # r is now another reference to the object referenced by i
r[0] = 1 # sets i[0]
are very similar to automatically dereferenced pointers variables in C
No, because then they'd be similar to C++ references in the above regard.
Do Python names contain some kind of address in them or is it just a name to a memory location?
The former is closer to the truth, assuming a straightforward implementation (again, PyPy might do things differently than CPython). In any case, Python variables are not storage locations, but rather labels/names for objects that may live anywhere in memory.
Every object in a Python process has an identity that can be obtained using the id function, which in CPython returns its memory address. You can check whether two variables (or expressions more generally) reference the same object by checking their id, or more directly by using is:
>>> i = [1, 2]
>>> j = i # a new reference
>>> i is j # same identity?
True
>>> j = [1, 2] # a new list
>>> i == j # same value?
True
>>> i is j # same identity?
False
Python names are, well, names. You have objects and names, that's it.
Creating an object, say [3, 4, 5] creates an object somewhere on the heap. You don't need to know how. Now you can put names to target this object, by assigning it to names:
x = [3, 4, 5]
That is, the assignment operator assigns names rather than values. x isn't [3, 4, 5], no, it's simply a name pointing to the [3, 4, 5] object. So doing this:
x = 1
Doesn't change the original [3, 4, 5] object; instead it binds the name x to the object 1. Also note that most expressions, such as [3, 4, 5] but also 8 + 3, create temporaries. Unless you assign a name to a temporary, it dies immediately. There is no mechanism to keep unreferenced objects alive and cache them (apart from implementation details such as CPython's caching of small integers). For example, the following identity check is False:
>>> x = [3, 4, 5]
>>> x is [3, 4, 5] # must be some object, right? no!
False
However, that's merely assignment (which is not overloadable in Python). In fact, names in Python behave very much like automatically dereferencing pointers, except that the objects they point to are automatically reference counted and die once they are no longer referenced (in CPython, at least), and that names are not automatically dereferenced on assignment.
Thanks to this memory model, the C++ way of overloading index operations, for example, doesn't work. Instead, Python uses __setitem__ and __getitem__ because you can't return anything that's "assignable". Furthermore, the operators +=, *=, etc. always assign their result back to the name: immutable types create a new object, while mutable types such as list implement __iadd__ and friends to modify the object in place and return it.
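It is worth checking how augmented assignment behaves for mutable and immutable types, since the two differ. A small sketch (variable names are illustrative): for a list, += modifies the existing object in place and the name keeps referring to it, while for an int or tuple a new object is created and bound to the name.

```python
nums = [1, 2]
alias = nums
nums += [3]             # list implements __iadd__: extends the same object
print(alias)            # [1, 2, 3]
print(nums is alias)    # True

n = 1
m = n
n += 1                  # int is immutable: a new object is bound to n
print(n, m)             # 2 1

t = (1, 2)
u = t
t += (3,)               # tuple is immutable: t now names a new tuple
print(t, u)             # (1, 2, 3) (1, 2)
```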
In CPython, objects are stored on a heap and are reclaimed primarily via reference counting, with a supplementary garbage collector for reference cycles.
Variables are references to objects like in Java, and thus point 1 applies. I am not familiar with either C++ or automatically dereferenced pointer variables in C, to make a call on those.
Ultimately, it's the python interpreter that does the looking up of items in the interpreter structures, which usually are python lists and dictionaries and other such abstract containers; namespaces use dict (a hash table) for example, where the names and values are pointers to other python objects. These are managed explicitly by the mapping protocol.
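The dictionary-backed nature of namespaces can be seen from Python itself. In this sketch, types.SimpleNamespace is used as a stand-in for a module-like namespace (my choice for illustration; module and instance namespaces expose their dict in the same way through vars()).

```python
import types

ns = types.SimpleNamespace()   # stand-in for a module-like namespace
ns.x = [3, 4, 5]               # bind the name x to a list object
ns.y = ns.x                    # bind a second name to the same object

table = vars(ns)               # the dict that backs the namespace
print(sorted(table))               # ['x', 'y']
print(table["x"] is table["y"])    # True: two names, one object

table["z"] = 1                 # adding a dict entry creates a name
print(ns.z)                    # 1
```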
To the python programmer, this is all hidden; you don't need to know where your objects live, just that they are still alive as long as you have something referencing them. You pass around these references when coding in python.