I'm trying to understand Python's approach to variable scope. In this example, why is f() able to alter the value of x, as perceived within main(), but not the value of n?
def f(n, x):
    n = 2
    x.append(4)
    print('In f():', n, x)

def main():
    n = 1
    x = [0,1,2,3]
    print('Before:', n, x)
    f(n, x)
    print('After: ', n, x)

main()
Output:
Before: 1 [0, 1, 2, 3]
In f(): 2 [0, 1, 2, 3, 4]
After: 1 [0, 1, 2, 3, 4]
See also: How do I pass a variable by reference?
Some answers contain the word "copy" in the context of a function call. I find it confusing.
Python never copies the objects you pass during a function call.
Function parameters are names. When you call a function, Python binds these parameters to whatever objects you pass (via names in a caller scope).
Objects can be mutable (like lists) or immutable (like integers and strings in Python). You can change a mutable object. You can't change a name; you can only bind it to another object.
Your example is not about scopes or namespaces, it is about naming and binding and mutability of an object in Python.
def f(n, x): # these `n`, `x` have nothing to do with `n` and `x` from main()
    n = 2 # put `n` label on `2` balloon
    x.append(4) # call `append` method of whatever object `x` is referring to.
    print('In f():', n, x)
    x = [] # put `x` label on `[]` balloon
    # x = [] has no effect on the original list that is passed into the function
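To check the "no copy" claim directly, here is a minimal sketch that compares identities with `is`. The names `show_binding` and `caller_list` are made up for this illustration.

```python
def show_binding(param, original):
    # `param` is just another name for the object the caller passed in.
    same_before = param is original
    param = []  # rebinding: `param` now names a brand-new list
    same_after = param is original
    return same_before, same_after

caller_list = [0, 1, 2, 3]
result = show_binding(caller_list, caller_list)
print(result)       # (True, False)
print(caller_list)  # [0, 1, 2, 3]
```

The first `True` shows that the function received the very same object; the `False` shows that rebinding the parameter only moved the local label.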
Here are nice pictures on the difference between variables in other languages and names in Python.
You've got a number of answers already, and I broadly agree with J.F. Sebastian, but you might find this useful as a shortcut:
Any time you see varname =, you're creating a new name binding within the function's scope. Whatever value varname was bound to before is lost within this scope.
Any time you see varname.foo() you're calling a method on varname. The method may alter varname (e.g. list.append). varname (or, rather, the object that varname names) may exist in more than one scope, and since it's the same object, any changes will be visible in all scopes.
[note that the global keyword creates an exception to the first case]
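The `global` exception can be sketched as follows, assuming the code runs as an ordinary script or module. The names `counter`, `bump_local` and `bump_global` are hypothetical.

```python
counter = 0  # module-level name

def bump_local():
    counter = 1  # creates a new local name; the module-level one is untouched

def bump_global():
    global counter
    counter = 1  # rebinds the module-level name itself

bump_local()
after_local = counter
bump_global()
after_global = counter
print(after_local, after_global)  # 0 1
```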
f doesn't actually alter the value of x (which is always the same reference to an instance of a list). Rather, it alters the contents of this list.
In both cases, a copy of a reference is passed to the function. Inside the function,
n gets assigned a new value. Only the reference inside the function is modified, not the one outside it.
x does not get assigned a new value: neither the reference inside nor the one outside the function is modified. Instead, x’s value is modified.
Since both the x inside the function and outside it refer to the same value, both see the modification. By contrast, the n inside the function and outside it refer to different values after n was reassigned inside the function.
I will rename variables to reduce confusion. n -> nf or nmain. x -> xf or xmain:
def f(nf, xf):
    nf = 2
    xf.append(4)
    print('In f():', nf, xf)

def main():
    nmain = 1
    xmain = [0,1,2,3]
    print('Before:', nmain, xmain)
    f(nmain, xmain)
    print('After: ', nmain, xmain)

main()
When you call the function f, the Python runtime makes a copy of xmain and assigns it to xf, and similarly assigns a copy of nmain to nf.
In the case of n, the value that is copied is 1.
In the case of x the value that is copied is not the literal list [0, 1, 2, 3]. It is a reference to that list. xf and xmain are pointing at the same list, so when you modify xf you are also modifying xmain.
If, however, you were to write something like:
xf = ["foo", "bar"]
xf.append(4)
you would find that xmain has not changed. This is because, in the line xf = ["foo", "bar"], you have changed xf to point to a new list. Any changes you make to this new list will have no effect on the list that xmain still points to.
Hope that helps. :-)
If the functions are re-written with completely different variables and we call id on them, it then illustrates the point well. I didn't get this at first and read jfs' post with the great explanation, so I tried to understand/convince myself:
def f(y, z):
    y = 2
    z.append(4)
    print ('In f(): ', id(y), id(z))

def main():
    n = 1
    x = [0,1,2,3]
    print ('Before in main:', n, x,id(n),id(x))
    f(n, x)
    print ('After in main:', n, x,id(n),id(x))

main()
Before in main: 1 [0, 1, 2, 3] 94635800628352 139808499830024
In f(): 94635800628384 139808499830024
After in main: 1 [0, 1, 2, 3, 4] 94635800628352 139808499830024
z and x have the same id. Just different tags for the same underlying structure as the article says.
My general understanding is that any object variable (such as a list or a dict, among others) can be modified through its functions. What I believe you are not able to do is reassign the parameter - i.e., assign it by reference within a callable function.
That is consistent with many other languages.
Run the following short script to see how it works:
def func1(x, l1):
    x = 5
    l1.append("nonsense")

y = 10
list1 = ["meaning"]
func1(y, list1)
print(y)
print(list1)
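If the goal really is to replace what the caller sees, two common options are to return the new object and rebind at the call site, or to replace the contents in place with slice assignment. A minimal sketch, with the hypothetical names `replaced` and `replace_in_place`:

```python
def replaced(items):
    # Build and return a new list; the caller decides whether to rebind.
    return [item * 2 for item in items]

def replace_in_place(items):
    # Slice assignment mutates the existing list object instead of rebinding.
    items[:] = [item * 2 for item in items]

values = [1, 2, 3]
alias = values

values = replaced(values)   # rebinding: `alias` still names the old list
print(values, alias)        # [2, 4, 6] [1, 2, 3]

replace_in_place(alias)     # mutation: visible through every name for that list
print(alias)                # [2, 4, 6]
```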
It's because a list is a mutable object. You're not setting x to the value of [0,1,2,3], you're defining a label to the object [0,1,2,3].
You should declare your function f() like this:
def f(n, x=None):
    if x is None:
        x = []
    ...
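The `x=None` signature is the usual guard against a shared mutable default argument. A minimal sketch of the difference, using the hypothetical function names `append_shared` and `append_fresh`:

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # A new list is created on every call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first_shared = append_shared(1)
second_shared = append_shared(2)
print(second_shared)                     # [1, 2]
print(first_shared is second_shared)     # True

print(append_fresh(1), append_fresh(2))  # [1] [2]
```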
n is an int (immutable), and a copy is passed to the function, so in the function you are changing the copy.
x is a list (mutable), and a copy of the pointer is passed to the function, so x.append(4) changes the contents of the list. However, if you said x = [0,1,2,3,4] in your function, you would not change the contents of x in main().
Python passes a copy of the value of the reference. An object occupies a region of memory, and a reference to that object is itself a value stored somewhere in memory. A name is associated with a reference. A function call always copies the value of the reference, so in your code n inside the function is a new name holding a copy of the caller's reference; when you assign to it, only that new name, which lives in the function's own frame, is rebound. For the list, the reference is copied in the same way, but it still refers to the same memory (since you never assign the name a new value). That is the magic of Python!
When you execute n = 2 inside the function, Python finds a memory location holding 2 and labels it n. But when you call the method append, you are referring to the object at location x (whatever its value is) and performing an operation on that object.
Python is a pure pass-by-value language if you think about it the right way. A python variable stores the location of an object in memory. The Python variable does not store the object itself. When you pass a variable to a function, you are passing a copy of the address of the object being pointed to by the variable.
Contrast these two functions
def foo(x):
    x[0] = 5

def goo(x):
    x = []
Now, when you type into the shell
>>> cow = [3,4,5]
>>> foo(cow)
>>> cow
[5, 4, 5]
Compare this to goo.
>>> cow = [3,4,5]
>>> goo(cow)
>>> cow
[3, 4, 5]
In the first case, we pass a copy of the address of cow to foo, and foo modifies the state of the object residing there. The object gets modified.
In the second case you pass a copy of the address of cow to goo. Then goo proceeds to change that copy. Effect: none.
I call this the pink house principle. If you make a copy of your address and tell a
painter to paint the house at that address pink, you will wind up with a pink house.
If you give the painter a copy of your address and tell him to change it to a new address,
the address of your house does not change.
The explanation eliminates a lot of confusion. Python passes the addresses variables store by value.
As jouell said, it's a matter of what points to what, and I'd add that it's also a matter of the difference between what = does and what the .append method does.
When you define n and x in main, you tell them to point at two objects, namely 1 and [0,1,2,3]. That is what = does: it tells what your variable should point to.
When you call the function f(n,x), you tell two new local variables nf and xf to point at the same two objects as n and x.
When you use "something"="anything_new", you change what "something" points to. When you use .append, you change the object itself.
Somehow, even though you gave them the same names, n in the main() and the n in f() are not the same entity, they only originally point to the same object (same goes for x actually). A change to what one of them points to won't affect the other. However, if you instead make a change to the object itself, that will affect both variables as they both point to this same, now modified, object.
Let's illustrate the difference between the method .append and the = without defining a new function:
compare
m = [1,2,3]
n = m # this tells n to point at the same object as m does at the moment
m = [1,2,3,4] # writing m = m + [4] would also do the same
print('n = ', n,'m = ',m)
to
m = [1,2,3]
n = m
m.append(4)
print('n = ', n,'m = ',m)
In the first code, it will print n = [1, 2, 3] m = [1, 2, 3, 4], since in the 3rd line you didn't change the object [1,2,3], but rather told m to point to a new, different object (using '='), while n still pointed at the original object.
In the second code, it will print n = [1, 2, 3, 4] m = [1, 2, 3, 4]. This is because here both m and n still point to the same object throughout the code, but you modified the object itself (that m is pointing to) using the .append method. Note that the result of the second code will be the same regardless of whether you write m.append(4) or n.append(4) on the 3rd line.
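One related subtlety worth checking for yourself: for lists, the augmented assignment `+=` behaves like `.append`/`.extend` (it mutates in place), whereas `m = m + [4]` builds a new list and rebinds. A small sketch, with `p` and `q` as extra illustrative names:

```python
m = [1, 2, 3]
n = m
m = m + [4]       # builds a new list, then rebinds m to it
print(n, m)       # [1, 2, 3] [1, 2, 3, 4]

p = [1, 2, 3]
q = p
p += [4]          # for lists, += extends the existing object in place
print(q, p)       # [1, 2, 3, 4] [1, 2, 3, 4]
print(q is p)     # True
```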
Once you understand that, the only confusion that remains is really to understand that, as I said, the n and x inside your f() function and the ones in your main() are NOT the same, they only initially point to the same object when you call f().
Please allow me to edit again. These concepts come from my experience learning Python by trial and error and from the internet, mostly Stack Overflow. Some of what I found was mistaken and some of it helped.
Python variables use references. I think of a reference as a link relating a name, a memory address, and a value.
When we do B = A, we actually create a nickname for A, so the object now has two names, A and B. When we use B, we are actually using A. We create a link to the value of the other variable instead of creating a new copy of that value; this is what we call a reference. This way of thinking leads to two problems.
when we do
A = [1]
B = A # Now B is an alias of A
A.append(2) # Now the value of A has been changed
print(B)
>>> [1, 2]
# B is still an alias of A
# Which means when we call B, the real name we are calling is A
# When we do something to B, the real name of our object is A
B.append(3)
print(A)
>>> [1, 2, 3]
This is what happens when we pass arguments to functions
def test(B):
    print('My name is B')
    print(f'My value is {B}')
    print(' I am just a nickname, My real name is A')
    B.append(2)

A = [1]
test(A)
print(A)
>>> [1, 2]
We pass A as an argument of a function, but the name of this argument in that function is B.
Same one with different names.
So when we do B.append, we are doing A.append
When we pass an argument to a function, we are not passing a variable, we are passing an alias.
And here come the two problems.
The equals sign always creates a new name:
A = [1]
B = A
B.append(2)
A = A[0] # Now A is a brand-new name and has nothing to do with the old A from now on.
B.append(3)
print(A)
>>> 1
# The relation between A and B is removed when we assign the name A to something else
# Now B is an independent variable of its own.
The equals sign is a statement that declares a brand-new name;
this was the part that confused me.
A = [1, 2, 3]
# No equals sign: we are working on the original object
A.append(4)
>>> [1, 2, 3, 4]
# This would create a new A
A = A + [4]
>>> [1, 2, 3, 4]
and the function
def test(B):
    B = [1, 2, 3] # B is a new name now, not an alias of A anymore
    B.append(4) # so this operation won't affect A

A = [1, 2, 3]
test(A)
print(A)
>>> [1, 2, 3]
# ---------------------------
def test(B):
    B.append(4) # B is a nickname of A, so this acts on A

A = [1, 2, 3]
test(A)
print(A)
>>> [1, 2, 3, 4]
The first problem is:
the left side of an assignment is always a brand-new name, a new variable,
unless the right side is a name, as in B = A, which only creates an alias.
The second problem: some things can never be changed. We cannot modify the original; we can only create a new one.
This is what we call immutable.
When we do A = 123, we create an entry which contains a name, a value, and an address.
When we do B = A, we copy the address and value from A to B, so every operation on B affects the same address that holds the value of A.
When it comes to strings, numbers, and tuples, the pairing of value and address can never be changed. When we put a str at some address, it is locked right away, and the result of any modification is put at another address.
A = 'string' creates a protected value and address to store the string 'string'. Currently, there is no built-in function or method that can modify a string with syntax like list.append, because such code would modify the original value at an address.
The value and address of a string, a number, or a tuple are protected, locked, immutable.
All we can do with a string is use the syntax A = B.method(); we have to bind a name to store the new string value.
Please extend this discussion if you are still confused.
This discussion helped me figure out mutable / immutable / reference / argument / variable / name once and for all; hopefully it can help someone else too.
##############################
I had modified my answer tons of times and then realized I don't have to say anything; Python has explained itself already.
a = 'string'
a.replace('t', '_')
print(a)
>>> 'string'
a = a.replace('t', '_')
print(a)
>>> 's_ring'
b = 100
b + 1
print(b)
>>> 100
b = b + 1
print(b)
>>> 101
def test_id(arg):
    c = id(arg)
    arg = 123
    d = id(arg)
    return

a = 'test ids'
b = id(a)
test_id(a)
e = id(a)
# b = c = e != d

# this function does change the original value
def change_like_mutable(arg):
    arg.append(1)
    arg.insert(0, 9)
    arg.remove(2)
    return

test_1 = [1, 2, 3]
change_like_mutable(test_1)

# this function doesn't
def wont_change_like_str(arg):
    arg = [1, 2, 3]
    return

test_2 = [1, 1, 1]
wont_change_like_str(test_2)
print("Doesn't change, like an immutable:", test_2)
The devil is not reference / value / mutable or not / instance, namespace or variable / list or str; IT IS THE SYNTAX, THE EQUALS SIGN.
I know Python doesn't have pointers, but is there a way to have this yield 2 instead
>>> a = 1
>>> b = a # modify this line somehow so that b "points to" a
>>> a = 2
>>> b
1
?
Here's an example: I want form.data['field'] and form.field.value to always have the same value. It's not completely necessary, but I think it would be nice.
In PHP, for example, I can do this:
<?php
class Form {
    public $data = [];
    public $fields;

    function __construct($fields) {
        $this->fields = $fields;
        foreach($this->fields as &$field) {
            $this->data[$field['id']] = &$field['value'];
        }
    }
}

$f = new Form([
    [
        'id' => 'fname',
        'value' => 'George'
    ],
    [
        'id' => 'lname',
        'value' => 'Lucas'
    ]
]);

echo $f->data['fname'], $f->fields[0]['value']; # George George
$f->data['fname'] = 'Ralph';
echo $f->data['fname'], $f->fields[0]['value']; # Ralph Ralph
Output:
GeorgeGeorgeRalphRalph
Or like this in C++ (I think this is right, but my C++ is rusty):
#include <iostream>
using namespace std;

int main() {
    int value = 0;
    int* a = &value; // point a at real storage before writing through it
    int* b = a;
    *a = 1;
    cout << *a << endl << *b << endl; // 1 1
    return 0;
}
There's no way you can do that changing only that line. You can do:
a = [1]
b = a
a[0] = 2
b[0]
That creates a list, assigns the reference to a, then b also, uses the a reference to set the first element to 2, then accesses using the b reference variable.
I want form.data['field'] and
form.field.value to always have the
same value
This is feasible, because it involves decorated names and indexing -- i.e., completely different constructs from the barenames a and b that you're asking about, and for which your request is utterly impossible. Why ask for something impossible and totally different from the (possible) thing you actually want?!
Maybe you don't realize how drastically different barenames and decorated names are. When you refer to a barename a, you're getting exactly the object a was last bound to in this scope (or an exception if it wasn't bound in this scope) -- this is such a deep and fundamental aspect of Python that it can't possibly be subverted. When you refer to a decorated name x.y, you're asking an object (the object x refers to) to please supply "the y attribute" -- and in response to that request, the object can perform totally arbitrary computations (and indexing is quite similar: it also allows arbitrary computations to be performed in response).
Now, your "actual desiderata" example is mysterious because in each case two levels of indexing or attribute-getting are involved, so the subtlety you crave could be introduced in many ways. What other attributes is form.field supposed to have, for example, besides value? Without those further .value computations, possibilities would include:
class Form(object):
    ...
    def __getattr__(self, name):
        return self.data[name]
and
class Form(object):
    ...
    @property
    def data(self):
        return self.__dict__
The presence of .value suggests picking the first form, plus a kind-of-useless wrapper:
class KouWrap(object):
    def __init__(self, value):
        self.value = value

class Form(object):
    ...
    def __getattr__(self, name):
        return KouWrap(self.data[name])
If assignments such as form.field.value = 23 are also supposed to set the entry in form.data, then the wrapper must become more complex indeed, and not all that useless:
class MciWrap(object):
    def __init__(self, data, k):
        self._data = data
        self._k = k

    @property
    def value(self):
        return self._data[self._k]

    @value.setter
    def value(self, v):
        self._data[self._k] = v

class Form(object):
    ...
    def __getattr__(self, name):
        return MciWrap(self.data, name)
The latter example is roughly as close as it gets, in Python, to the sense of "a pointer" as you seem to want -- but it's crucial to understand that such subtleties can ever only work with indexing and/or decorated names, never with barenames as you originally asked!
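For readers who want to run the idea end to end, here is a minimal self-contained sketch of the same design. The `...` parts of `Form` are filled in with assumptions (a keyword-argument constructor and an `AttributeError` for unknown fields), and `FieldProxy` is a renamed stand-in for the wrapper above, so treat it as an illustration rather than the answer's exact code.

```python
class FieldProxy:
    """Reads and writes one key of a shared dict (same idea as MciWrap)."""
    def __init__(self, data, key):
        self._data = data
        self._key = key

    @property
    def value(self):
        return self._data[self._key]

    @value.setter
    def value(self, new_value):
        self._data[self._key] = new_value


class Form:
    def __init__(self, **initial):  # assumed constructor, for illustration only
        self.data = dict(initial)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails. Going through
        # __dict__ avoids recursing if `data` has not been set yet.
        data = self.__dict__.get("data", {})
        if name in data:
            return FieldProxy(data, name)
        raise AttributeError(name)


form = Form(fname="George")
form.fname.value = "Ralph"
print(form.data["fname"])   # Ralph
form.data["fname"] = "Lucas"
print(form.fname.value)     # Lucas
```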
It's not a bug, it's a feature :-)
When you look at the '=' operator in Python, don't think in terms of assignment. You don't assign things, you bind them. = is a binding operator.
So in your code, you are giving the value 1 a name: a. Then, you are giving the value in 'a' a name: b. Then you are binding the value 2 to the name 'a'. The value bound to b doesn't change in this operation.
Coming from C-like languages, this can be confusing, but once you become accustomed to it, you find that it helps you to read and reason about your code more clearly: the value which has the name 'b' will not change unless you explicitly change it. And if you do an 'import this', you'll find that the Zen of Python states that Explicit is better than implicit.
Note as well that functional languages such as Haskell also use this paradigm, with great value in terms of robustness.
Yes! there is a way to use a variable as a pointer in python!
I am sorry to say that many of the answers were partially wrong. In principle every equals-sign (=) assignation shares the memory address (check the id(obj) function), but in practice it does not always behave that way. There are variables for which "=" ultimately behaves like a copy of the memory space, mostly simple objects (e.g. "int" objects), and others for which it does not (e.g. "list" and "dict" objects).
Here is an example of pointer assignation
dict1 = {'first':'hello', 'second':'world'}
dict2 = dict1 # pointer assignation mechanism
dict2['first'] = 'bye'
dict1
>>> {'first':'bye', 'second':'world'}
Here is an example of copy assignation
a = 1
b = a # copy of memory mechanism. up to here id(a) == id(b)
b = 2 # new address generation. therefore without pointer behaviour
a
>>> 1
Pointer assignation is a pretty useful tool for aliasing without the waste of extra memory, in certain situations for performing comfy code,
class cls_X():
    ...
    def method_1(self):
        pd1 = self.obj_clsY.dict_vars_for_clsX['meth1'] # pointer dict 1: aliasing
        pd1['var4'] = self.method2(pd1['var1'], pd1['var2'], pd1['var3'])
    #enddef method_1
    ...
#endclass cls_X
but one has to be aware of this usage in order to prevent coding mistakes.
To conclude, by default some variables are barenames (simple objects like int, float, str, ...), and some are pointers when assigned between them (e.g. dict1 = dict2). How to recognize them? Just try this experiment with them. In IDEs with a variable explorer panel, the memory address ("#axbbbbbb...") usually appears in the definition of pointer-mechanism objects.
I suggest investigating the topic. There are surely many people who know much more about it (see the "ctypes" module). I hope this is helpful. Enjoy the good use of the objects! Regards, José Crespo
>> id(1)
1923344848 # identity of the location in memory where 1 is stored
>> id(1)
1923344848 # always the same
>> a = 1
>> b = a # or equivalently b = 1, because 1 is immutable
>> id(a)
1923344848
>> id(b) # equal to id(a)
1923344848
As you can see a and b are just two different names that reference to the same immutable object (int) 1. If later you write a = 2, you reassign the name a to a different object (int) 2, but the b continues referencing to 1:
>> id(2)
1923344880
>> a = 2
>> id(a)
1923344880 # equal to id(2)
>> b
1 # b hasn't changed
>> id(b)
1923344848 # equal to id(1)
What would happen if you had a mutable object instead, such as a list [1]?
>> id([1])
328817608
>> id([1])
328664968 # different from the previous id, because each time a new list is created
>> a = [1]
>> id(a)
328817800
>> id(a)
328817800 # now same as before
>> b = a
>> id(b)
328817800 # same as id(a)
Again, we are referencing to the same object (list) [1] by two different names a and b. However now we can mutate this list while it remains the same object, and a, b will both continue referencing to it
>> a[0] = 2
>> a
[2]
>> b
[2]
>> id(a)
328817800 # same as before
>> id(b)
328817800 # same as before
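When that sharing is not wanted, an explicit copy gives a separate object. A short sketch (the name `c` is added for this illustration; `list.copy()` makes a shallow copy):

```python
a = [1]
b = a            # alias: same object as a
c = a.copy()     # shallow copy: a new list with the same items
a[0] = 2
print(b, c)              # [2] [1]
print(b is a, c is a)    # True False
```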
From one point of view, everything is a pointer in Python. Your example works a lot like the C++ code.
int* a = new int(1);
int* b = a;
a = new int(2);
cout << *b << endl; // prints 1
(A closer equivalent would use some type of shared_ptr<Object> instead of int*.)
Here's an example: I want
form.data['field'] and
form.field.value to always have the
same value. It's not completely
necessary, but I think it would be
nice.
You can do this by overloading __getitem__ in form.data's class.
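One way this suggestion might look in practice is sketched below. All class names (`Field`, `FieldData`, `Form`) and the keyword-argument constructor are assumptions made for the illustration; the question does not show the real classes.

```python
class Field:
    def __init__(self, value):
        self.value = value


class FieldData:
    """Dict-like view that reads and writes through to Field objects."""
    def __init__(self, fields):
        self._fields = fields  # mapping of field name -> Field

    def __getitem__(self, name):
        return self._fields[name].value

    def __setitem__(self, name, value):
        self._fields[name].value = value


class Form:
    def __init__(self, **values):
        self._fields = {name: Field(value) for name, value in values.items()}
        self.data = FieldData(self._fields)

    def __getattr__(self, name):
        # Called only when normal lookup fails; expose fields as attributes.
        fields = self.__dict__.get("_fields", {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)


form = Form(fname="George")
form.data["fname"] = "Ralph"
print(form.fname.value)     # Ralph
form.fname.value = "Lucas"
print(form.data["fname"])   # Lucas
```

Here the value is stored once, on the `Field`, and `form.data` is only a view onto it, so the two spellings cannot drift apart.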
This is a Python pointer (different from C/C++):
>>> a = lambda : print('Hello')
>>> a
<function <lambda> at 0x0000018D192B9DC0>
>>> id(a) == int(0x0000018D192B9DC0)
True
>>> from ctypes import cast, py_object
>>> cast(id(a), py_object).value == cast(int(0x0000018D192B9DC0), py_object).value
True
>>> cast(id(a), py_object).value
<function <lambda> at 0x0000018D192B9DC0>
>>> cast(id(a), py_object).value()
Hello
I wrote the following simple class as, effectively, a way to emulate a pointer in python:
class Parameter:
    """Syntactic sugar for getter/setter pair
    Usage:
    p = Parameter(getter, setter)
    Set parameter value:
    p(value)
    p.val = value
    p.set(value)
    Retrieve parameter value:
    p()
    p.val
    p.get()
    """
    def __init__(self, getter, setter):
        """Create parameter
        Required positional parameters:
        getter: called with no arguments, retrieves the parameter value.
        setter: called with value, sets the parameter.
        """
        self._get = getter
        self._set = setter

    def __call__(self, val=None):
        if val is not None:
            self._set(val)
        return self._get()

    def get(self):
        return self._get()

    def set(self, val):
        self._set(val)

    @property
    def val(self):
        return self._get()

    @val.setter
    def val(self, val):
        self._set(val)
Here's an example of use (from a jupyter notebook page):
l1 = list(range(10))
def l1_5_getter(lst=l1, number=5):
    return lst[number]

def l1_5_setter(val, lst=l1, number=5):
    lst[number] = val
[
l1_5_getter(),
l1_5_setter(12),
l1,
l1_5_getter()
]
Out = [5, None, [0, 1, 2, 3, 4, 12, 6, 7, 8, 9], 12]
p = Parameter(l1_5_getter, l1_5_setter)
print([
p(),
p.get(),
p.val,
p(13),
p(),
p.set(14),
p.get()
])
p.val = 15
print(p.val, l1)
[12, 12, 12, 13, 13, None, 14]
15 [0, 1, 2, 3, 4, 15, 6, 7, 8, 9]
Of course, it is also easy to make this work for dict items or attributes of an object. There is even a way to do what the OP asked for, using globals():
def setter(val, dict=globals(), key='a'):
    dict[key] = val

def getter(dict=globals(), key='a'):
    return dict[key]
pa = Parameter(getter, setter)
pa(2)
print(a)
pa(3)
print(a)
This will print out 2, followed by 3.
Messing with the global namespace in this way is kind of transparently a terrible idea, but it shows that it is possible (if inadvisable) to do what the OP asked for.
The example is, of course, fairly pointless. But I have found this class to be useful in the application for which I developed it: a mathematical model whose behavior is governed by numerous user-settable mathematical parameters, of diverse types (which, because they depend on command line arguments, are not known at compile time). And once access to something has been encapsulated in a Parameter object, all such objects can be manipulated in a uniform way.
Although it doesn't look much like a C or C++ pointer, this is solving a problem that I would have solved with pointers if I were writing in C++.
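The remark about dict items and object attributes can be sketched with two small factory functions that build getter/setter pairs. The names `item_accessors`, `attr_accessors`, `settings` and `Model` are hypothetical; each returned pair has the shape that `Parameter(getter, setter)` above expects.

```python
def item_accessors(container, key):
    """Return a (getter, setter) pair bound to container[key]."""
    def getter():
        return container[key]

    def setter(value):
        container[key] = value

    return getter, setter


def attr_accessors(obj, name):
    """Return a (getter, setter) pair bound to the attribute obj.<name>."""
    def getter():
        return getattr(obj, name)

    def setter(value):
        setattr(obj, name, value)

    return getter, setter


settings = {"rate": 0.5}
get_rate, set_rate = item_accessors(settings, "rate")
set_rate(0.75)
print(get_rate(), settings)   # 0.75 {'rate': 0.75}


class Model:
    pass


model = Model()
model.steps = 10
get_steps, set_steps = attr_accessors(model, "steps")
set_steps(20)
print(model.steps)            # 20
```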
The following code emulates exactly the behavior of pointers in C:
from collections import deque # more efficient than list for appending things
pointer_storage = deque()
pointer_address = 0

class new:
    def __init__(self):
        global pointer_storage
        global pointer_address
        self.address = pointer_address
        self.val = None
        pointer_storage.append(self)
        pointer_address += 1

def get_pointer(address):
    return pointer_storage[address]

def get_address(p):
    return p.address

null = new() # create a null pointer, whose address is 0
Here are examples of use:
p = new()
p.val = 'hello'
q = new()
q.val = p
r = new()
r.val = 33
p = get_pointer(3)
print(p.val, flush = True)
p.val = 43
print(get_pointer(3).val, flush = True)
But it's now time to give more professional code, including the option of deleting pointers, which I've just found in my personal library:
# C pointer emulation:
from collections import deque # more efficient than list for appending things
from sortedcontainers import SortedList #perform add and discard in log(n) times

class new:
    # C pointer emulation:
    # use as : p = new()
    # p.val
    # p.val = something
    # p.address
    # get_address(p)
    # del_pointer(p)
    # null (a null pointer)
    __pointer_storage__ = SortedList(key = lambda p: p.address)
    __to_delete_pointers__ = deque()
    __pointer_address__ = 0

    def __init__(self):
        self.val = None
        if new.__to_delete_pointers__:
            p = new.__to_delete_pointers__.pop()
            self.address = p.address
            new.__pointer_storage__.discard(p) # performed in log(n) time thanks to sortedcontainers
            new.__pointer_storage__.add(self) # idem
        else:
            self.address = new.__pointer_address__
            new.__pointer_storage__.add(self)
            new.__pointer_address__ += 1

def get_pointer(address):
    return new.__pointer_storage__[address]

def get_address(p):
    return p.address

def del_pointer(p):
    new.__to_delete_pointers__.append(p)

null = new() # create a null pointer, whose address is 0
I don't know whether my comment will help, but if you want to use pointers in Python, you can use dictionaries instead of variables.
Let's say your example becomes:
a = {'value': 1}
b = {'value': 2}
then you change a to 2:
a['value'] = 2
print(a) #{'value': 2}
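To see the pointer-like effect, the second name has to be bound to the same dictionary rather than to a new one. A minimal sketch, assuming `b = a` is what is wanted:

```python
a = {'value': 1}
b = a              # b is a second name for the same dict
a['value'] = 2     # mutate the dict instead of rebinding a
print(b['value'])  # 2
```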
I'm trying to understand Python's approach to variable scope. In this example, why is f() able to alter the value of x, as perceived within main(), but not the value of n?
def f(n, x):
n = 2
x.append(4)
print('In f():', n, x)
def main():
n = 1
x = [0,1,2,3]
print('Before:', n, x)
f(n, x)
print('After: ', n, x)
main()
Output:
Before: 1 [0, 1, 2, 3]
In f(): 2 [0, 1, 2, 3, 4]
After: 1 [0, 1, 2, 3, 4]
See also: How do I pass a variable by reference?
Some answers contain the word "copy" in the context of a function call. I find it confusing.
Python doesn't copy objects you pass during a function call ever.
Function parameters are names. When you call a function, Python binds these parameters to whatever objects you pass (via names in a caller scope).
Objects can be mutable (like lists) or immutable (like integers and strings in Python). A mutable object you can change. You can't change a name, you just can bind it to another object.
Your example is not about scopes or namespaces, it is about naming and binding and mutability of an object in Python.
def f(n, x): # these `n`, `x` have nothing to do with `n` and `x` from main()
n = 2 # put `n` label on `2` balloon
x.append(4) # call `append` method of whatever object `x` is referring to.
print('In f():', n, x)
x = [] # put `x` label on `[]` ballon
# x = [] has no effect on the original list that is passed into the function
Here are nice pictures on the difference between variables in other languages and names in Python.
You've got a number of answers already, and I broadly agree with J.F. Sebastian, but you might find this useful as a shortcut:
Any time you see varname =, you're creating a new name binding within the function's scope. Whatever value varname was bound to before is lost within this scope.
Any time you see varname.foo() you're calling a method on varname. The method may alter varname (e.g. list.append). varname (or, rather, the object that varname names) may exist in more than one scope, and since it's the same object, any changes will be visible in all scopes.
[note that the global keyword creates an exception to the first case]
f doesn't actually alter the value of x (which is always the same reference to an instance of a list). Rather, it alters the contents of this list.
In both cases, a copy of a reference is passed to the function. Inside the function,
n gets assigned a new value. Only the reference inside the function is modified, not the one outside it.
x does not get assigned a new value: neither the reference inside nor outside the function are modified. Instead, x’s value is modified.
Since both the x inside the function and outside it refer to the same value, both see the modification. By contrast, the n inside the function and outside it refer to different values after n was reassigned inside the function.
I will rename variables to reduce confusion. n -> nf or nmain. x -> xf or xmain:
def f(nf, xf):
nf = 2
xf.append(4)
print 'In f():', nf, xf
def main():
nmain = 1
xmain = [0,1,2,3]
print 'Before:', nmain, xmain
f(nmain, xmain)
print 'After: ', nmain, xmain
main()
When you call the function f, the Python runtime makes a copy of xmain and assigns it to xf, and similarly assigns a copy of nmain to nf.
In the case of n, the value that is copied is 1.
In the case of x the value that is copied is not the literal list [0, 1, 2, 3]. It is a reference to that list. xf and xmain are pointing at the same list, so when you modify xf you are also modifying xmain.
If, however, you were to write something like:
xf = ["foo", "bar"]
xf.append(4)
you would find that xmain has not changed. This is because, in the line xf = ["foo", "bar"] you have change xf to point to a new list. Any changes you make to this new list will have no effects on the list that xmain still points to.
Hope that helps. :-)
If the functions are re-written with completely different variables and we call id on them, it then illustrates the point well. I didn't get this at first and read jfs' post with the great explanation, so I tried to understand/convince myself:
def f(y, z):
y = 2
z.append(4)
print ('In f(): ', id(y), id(z))
def main():
n = 1
x = [0,1,2,3]
print ('Before in main:', n, x,id(n),id(x))
f(n, x)
print ('After in main:', n, x,id(n),id(x))
main()
Before in main: 1 [0, 1, 2, 3] 94635800628352 139808499830024
In f(): 94635800628384 139808499830024
After in main: 1 [0, 1, 2, 3, 4] 94635800628352 139808499830024
z and x have the same id. Just different tags for the same underlying structure as the article says.
My general understanding is that any object variable (such as a list or a dict, among others) can be modified through its functions. What I believe you are not able to do is reassign the parameter - i.e., assign it by reference within a callable function.
That is consistent with many other languages.
Run the following short script to see how it works:
def func1(x, l1):
    x = 5
    l1.append("nonsense")

y = 10
list1 = ["meaning"]
func1(y, list1)
print(y)
print(list1)
It's because a list is a mutable object. You're not setting x to the value of [0,1,2,3], you're defining a label to the object [0,1,2,3].
You should declare your function f() like this:
def f(n, x=None):
    if x is None:
        x = []
    ...
n is an int (immutable), and a copy is passed to the function, so in the function you are changing the copy.
x is a list (mutable), and a copy of the pointer is passed to the function, so x.append(4) changes the contents of the list. However, if you said x = [0,1,2,3,4] in your function, you would not change the contents of x in main().
Python copies the value of the reference. An object occupies a region of memory, and a reference to that object is itself a value that occupies memory. A name is associated with a reference. A Python function call always copies the value of the reference, so in your code n inside f() is a new name; when you assign to it, only that new name in the function's own frame is rebound, and the caller's n is unaffected. For the list, the reference is copied too, but it still refers to the same memory (since you never assign the list name a new value). That is the magic of Python!
When you run n = 2 inside the function, Python finds the object 2 in memory and attaches the label n to it. But when you call the append method, you are referencing the object that x points to (whatever its value is) and performing an operation on that object.
Python is a pure pass-by-value language if you think about it the right way. A python variable stores the location of an object in memory. The Python variable does not store the object itself. When you pass a variable to a function, you are passing a copy of the address of the object being pointed to by the variable.
Contrast these two functions
def foo(x):
    x[0] = 5

def goo(x):
    x = []
Now, when you type into the shell
>>> cow = [3,4,5]
>>> foo(cow)
>>> cow
[5, 4, 5]
Compare this to goo.
>>> cow = [3,4,5]
>>> goo(cow)
>>> cow
[3, 4, 5]
In the first case, we pass a copy of the address of cow to foo, and foo modifies the state of the object residing there. The object gets modified.
In the second case you pass a copy of the address of cow to goo. Then goo proceeds to change that copy. Effect: none.
I call this the pink house principle. If you make a copy of your address and tell a painter to paint the house at that address pink, you will wind up with a pink house. If you give the painter a copy of your address and tell him to change it to a new address, the address of your house does not change.
The explanation eliminates a lot of confusion. Python passes the addresses variables store by value.
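The pink house principle can be sketched in a few lines. This is only an illustration: the dict my_house and the functions paint_pink and change_address are hypothetical names chosen to mirror the analogy.

```python
def paint_pink(house):
    # Follow the address and change the object found there.
    house["color"] = "pink"


def change_address(house):
    # Overwrite the local copy of the address; the caller's copy is unaffected.
    house = {"color": "blue"}


my_house = {"color": "white"}

paint_pink(my_house)
print(my_house)   # {'color': 'pink'}

change_address(my_house)
print(my_house)   # {'color': 'pink'}
```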
As jouell said, it's a matter of what points to what, and I'd add that it's also a matter of the difference between what = does and what the .append method does.
When you define n and x in main, you tell them to point at two objects, namely 1 and [0,1,2,3]. That is what = does: it tells your variable what it should point to.
When you call the function f(n,x), you tell two new local variables nf and xf to point at the same two objects as n and x.
When you write something = anything_new, you change what something points to. When you use .append, you change the object itself.
Even though you gave them the same names, the n in main() and the n in f() are not the same entity; they only originally point to the same object (the same goes for x). A change to what one of them points to won't affect the other. However, if you instead make a change to the object itself, that will affect both variables, as they both point to this same, now modified, object.
Let's illustrate the difference between the method .append and the = without defining a new function:
compare
m = [1,2,3]
n = m # this tells n to point at the same object as m does at the moment
m = [1,2,3,4] # writing m = m + [4] would also do the same
print('n = ', n,'m = ',m)
to
m = [1,2,3]
n = m
m.append(4)
print('n = ', n,'m = ',m)
In the first snippet, it will print n = [1, 2, 3] m = [1, 2, 3, 4], since on the 3rd line you didn't change the object [1,2,3], but rather you told m to point to a new, different object (using '='), while n still pointed at the original object.
In the second snippet, it will print n = [1, 2, 3, 4] m = [1, 2, 3, 4]. This is because here both m and n still point to the same object throughout the code, but you modified the object itself (that m is pointing to) using the .append method. Note that the result of the second snippet will be the same regardless of whether you write m.append(4) or n.append(4) on the 3rd line.
Once you understand that, the only confusion that remains is really to understand that, as I said, the n and x inside your f() function and the ones in your main() are NOT the same, they only initially point to the same object when you call f().
Please allow me to edit again. These concepts come from my experience of learning Python by trial and error and from the internet, mostly Stack Overflow. Some of what I found was mistaken and some of it was helpful.
Python variables use references. I think of a reference as the link between a name, a memory address, and a value.
When we do B = A, we actually create a nickname for A, so the object now has two names, A and B. When we use B, we are actually using A. We create a link to the value of the other variable instead of creating a new, equal value; this is what we call a reference. This way of thinking leads to two problems.
When we do
A = [1]
B = A # Now B is an alias of A
A.append(2) # Now the value of A had been changes
print(B)
>>> [1, 2]
# B is still an alias of A
# Which means when we call B, the real name we are calling is A
# When we do something to B, the real name of our object is A
B.append(3)
print(A)
>>> [1, 2, 3]
This is what happens when we pass arguments to functions
def test(B):
    print('My name is B')
    print(f'My value is {B}')
    print(' I am just a nickname, My real name is A')
    B.append(2)
A = [1]
test(A)
print(A)
>>> [1, 2]
We pass A as an argument of a function, but the name of this argument in that function is B.
It is the same object under a different name.
So when we do B.append, we are doing A.append.
When we pass an argument to a function, we are not passing a variable; we are passing an alias.
And here come the two problems.
The equal sign always creates a new name:
A = [1]
B = A
B.append(2)
A = A[0] # Now A is a brand-new name and has nothing to do with the old A from now on.
B.append(3)
print(A)
>>> 1
# The relation between A and B is removed when we assign the name A to something else.
# Now B is an independent variable of its own.
The equal sign declares a brand-new name.
This was the part that confused me.
A = [1, 2, 3]
# No equal sign, so we are working on the original object.
A.append(4)
>>> [1, 2, 3, 4]
# This would create a new A
A = A + [4]
>>> [1, 2, 3, 4]
and the function
def test(B):
    B = [1, 2, 3] # B is a new name now, not an alias of A anymore
    B.append(4) # so this operation won't affect A
A = [1, 2, 3]
test(A)
print(A)
>>> [1, 2, 3]
# ---------------------------
def test(B):
    B.append(4) # B is a nickname of A, so we are appending to A
A = [1, 2, 3]
test(A)
print(A)
>>> [1, 2, 3, 4]
The first problem is this:
the left side of an assignment is always rebound as a brand-new name, a new variable.
When the right side is itself a name, as in B = A, this creates only an alias.
The second problem is that some things can never be changed. We cannot modify the original; we can only create a new one.
This is what we call immutable.
When we do A = 123, we create a record that ties together a name, a value, and an address.
When we do B = A, B takes on the same address and value as A, so every operation on B affects the same address that holds the value of A.
When it comes to strings, numbers, and tuples, the pairing of value and address can never be changed. When we put a str at some address, it is locked right away; the result of any modification is put at another address.
A = 'string' creates a protected value and address to store the string 'string'. There is currently no built-in function or method that can modify a string with syntax like list.append, because such code would modify the original value at an address.
The value and address of a string, a number, or a tuple are protected, locked, immutable.
All we can do with a string is use the syntax A = B.method(); we have to bind a name to store the new string value.
Please extend this discussion if you are still confused.
This discussion helped me figure out mutable / immutable / reference / argument / variable / name once and for all; hopefully it can help someone else too.
##############################
I have modified my answer many times and realized I don't have to say anything; Python explains itself already.
a = 'string'
a.replace('t', '_')
print(a)
>>> 'string'
a = a.replace('t', '_')
print(a)
>>> 's_ring'
b = 100
b + 1
print(b)
>>> 100
b = b + 1
print(b)
>>> 101
def test_id(arg):
    c = id(arg)
    arg = 123
    d = id(arg)
    return
a = 'test ids'
b = id(a)
test_id(a)
e = id(a)
# b = c = e != d
# this function does change the original value
def change_like_mutable(arg):
    arg.append(1)
    arg.insert(0, 9)
    arg.remove(2)
    return
test_1 = [1, 2, 3]
change_like_mutable(test_1)
# this function doesn't
def wont_change_like_str(arg):
    arg = [1, 2, 3]
    return
test_2 = [1, 1, 1]
wont_change_like_str(test_2)
print("Doesn't change, like an immutable", test_2)
The devil is not in reference / value / mutable or not / instance, namespace or variable / list or str. IT IS THE SYNTAX: THE EQUAL SIGN.
I have read that while writing functions it is good practice to copy the arguments into other variables because it is not always clear whether the variable is immutable or not. [I don't remember where so don't ask]. I have been writing functions according to this.
As I understand creating a new variable takes some overhead. It may be small but it is there. So what should be done? Should I be creating new variables or not to hold the arguments?
I have read this and this. I am confused as to why floats and ints are immutable if they can be changed this easily.
EDIT:
I am writing simple functions. I'll post an example. I wrote the first one after I read that in Python arguments should be copied, and the second one after I realized by trial and error that it wasn't needed.
# When I copied arguments into another variable
def zeros_in_fact(num):
    '''Returns the number of zeros at the end of factorial of num'''
    temp = num
    if temp < 0:
        return 0
    fives = 0
    while temp:
        temp //= 5  # integer division, so the loop ends when temp reaches 0
        fives += temp
    return fives

# When I did not copy arguments into another variable
def zeros_in_fact(num):
    '''Returns the number of zeros at the end of factorial of num'''
    if num < 0:
        return 0
    fives = 0
    while num:
        num //= 5  # integer division, so the loop ends when num reaches 0
        fives += num
    return fives
I think it's best to keep it simple in questions like these.
The second link in your question is a really good explanation; in summary:
Methods take parameters which, as pointed out in that explanation, are passed "by value". The parameters in functions take the value of variables passed in.
For primitive types like strings, ints, and floats, the value of the variable is a pointer (the arrows in the following diagram) to a space in memory that represents the number or string.
code               | memory
                   |
an_int = 1         |    an_int ----> 1
                   |                 ^
another_int = 1    |    another_int /
When you reassign within the method, you change where the arrow points.
an_int = 2         |    an_int -------> 2
                   |    another_int --> 1
The numbers themselves don't change, and since those variables have scope only inside the functions, outside the function, the variables passed in remain the same as they were before: 1 and 1. But when you pass in a list or object, for example, you can change the values they point to outside of the function.
a_list = [1, 2, 3] |              1   2   3
                   |    a_list ->| ^ | ^ | ^ |
                   |              0   2   3
a_list[0] = 0      |    a_list ->| ^ | ^ | ^ |
Now, you can change where the arrows in the list, or object, point to, but the list's pointer still points to the same list as before. (There should probably actually only be one 2 and 3 in the diagram above for both sets of arrows, but the arrows would have gotten difficult to draw.)
So what does the actual code look like?
a = 5
def not_change(a):
    a = 6
not_change(a)
print(a) # a is still 5 outside the function
b = [1, 2, 3]
def change(b):
    b[0] = 0
change(b)
print(b) # b is now [0, 2, 3] outside the function
Whether you make a copy of the lists and objects you're given (ints and strings don't matter) and thus return new variables or change the ones passed in depends on what functionality you need to provide.
What you are doing in your code examples involves no noticeable overhead, but it also doesn't accomplish anything because it won't protect you from mutable/immutable problems.
The way to think about this is that there are two kinds of things in Python: names and objects. When you do x = y you are operating on a name, attaching that name to the object y. When you do x += y or other augmented assignment operators, you also are binding a name (in addition to doing the operation you use, + in this case). Anything else that you do is operating on objects. If the objects are mutable, that may involve changing their state.
Ints and floats cannot be changed. What you can do is change what int or float a name refers to. If you do
x = 3
x = x + 4
You are not changing the int. You are changing the name x so that it now is attached to the number 7 instead of the number 3. On the other hand when you do this:
x = []
x.append(2)
You are changing the list, not just pointing the name at a new object.
The difference can be seen when you have multiple names for the same object.
>>> x = 2
>>> y = x
>>> x = x + 3 # changing the name
>>> print x
5
>>> print y # y is not affected
2
>>> x = []
>>> y = x
>>> x.append(2) # changing the object
>>> print x
[2]
>>> print y # y is affected
[2]
Mutating an object means that you alter the object itself, so that all names that point to it see the changes. If you just change a name, other names are not affected.
The second question you linked to provides more information about how this works in the context of function arguments. The augmented assignment operators (+=, *=, etc.) are a bit trickier since they operate on names but may also mutate objects at the same time. You can find other questions on StackOverflow about how this works.
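As a rough sketch of why augmented assignment is trickier, compare += with plain + on a list and on an int. The function names add_new, add_in_place, and bump are made up for this illustration.

```python
def add_new(items, extra):
    # items + extra builds a new list; only the local name is rebound.
    items = items + extra
    return items


def add_in_place(items, extra):
    # += on a list extends the existing object, then rebinds the name to it.
    items += extra
    return items


def bump(n):
    # ints are immutable, so += can only rebind the local name to a new int.
    n += 1
    return n


nums = [1, 2]
r1 = add_new(nums, [3])
print(nums, r1)          # [1, 2] [1, 2, 3]

r2 = add_in_place(nums, [3])
print(nums, r2 is nums)  # [1, 2, 3] True

count = 10
bumped = bump(count)
print(count, bumped)     # 10 11
```

So the same operator mutates the caller's object when the object is a list, and leaves the caller untouched when it is an int.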
If you are rebinding the name then mutability of the object it contains is irrelevant. Only if you perform mutating operations must you create a copy. (And if you read between the lines, that indirectly says "don't mutate objects passed to you".)
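For the case where a function does need to mutate, one common approach is to copy first and work on the copy. A minimal sketch, assuming a shallow copy is enough (the function name sorted_descending is hypothetical):

```python
def sorted_descending(values):
    # Copy first, so the in-place sort below never touches the caller's list.
    result = list(values)
    result.sort(reverse=True)
    return result


scores = [3, 1, 2]
ranked = sorted_descending(scores)
print(scores)  # [3, 1, 2]
print(ranked)  # [3, 2, 1]
```

Note that list(values) copies only the outer list; nested mutable items would still be shared.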
For a project I'm working on, I'm implementing a linked-list data-structure, which is based on the idea of a pair, which I define as:
class Pair:
    def __init__(self, name, prefs, score):
        self.name = name
        self.score = score
        self.preferences = prefs
        self.next_pair = 0
        self.prev_pair = 0
where self.next_pair and self.prev_pair are pointers to the next and previous links, respectively.
To set up the linked-list, I have an install function that looks like this.
def install(i, pair):
    flag = 0
    try:
        old_pair = pair_array[i]
        while old_pair.next_pair != 0:
            if old_pair == pair:
                #if pair in remainders: remainders.remove(pair)
                return 0
            if old_pair.score < pair.score:
                flag = 1
                if old_pair.prev_pair == 0: # we are at the beginning
                    old_pair.prev_pair = pair
                    pair.next_pair = old_pair
                    pair_array[i] = pair
                    break
                else: # we are not at the beginning
                    pair.prev_pair = old_pair.prev_pair
                    pair.next_pair = old_pair
                    old_pair.prev_pair = pair
                    pair.prev_pair.next_pair = pair
                    break
            else:
                old_pair = old_pair.next_pair
        if flag==0:
            if old_pair == pair:
                #if pair in remainders: remainders.remove(pair)
                return 0
            if old_pair.score < pair.score:
                if old_pair.prev_pair==0:
                    old_pair.prev_pair = pair
                    pair.next_pair = old_pair
                    pair_array[i] = pair
                else:
                    pair.prev_pair = old_pair.prev_pair
                    pair.next_pair = old_pair
                    old_pair.prev_pair = pair
                    pair.prev_pair.next_pair = pair
            else:
                old_pair.next_pair = pair
                pair.prev_pair = old_pair
    except KeyError:
        pair_array[i] = pair
        pair.prev_pair = 0
        pair.next_pair = 0
Over the course of the program, I am building up a dictionary of these linked-lists, and taking links off of some and adding them in others. Between being pruned and re-installed, the links are stored in an intermediate array.
Over the course of debugging this program, I have come to realize that my understanding of the way Python passes arguments to functions is flawed. Consider this test case I wrote:
def test_install():
    p = Pair(20000, [3, 1, 2, 50], 45)
    print p.next_pair
    print p.prev_pair
    parse_and_get(g)
    first_run()
    rat = len(juggler_array)/len(circuit_array)
    pref_size = get_pref_size()
    print pref_size
    print install(3, p)
    print p.next_pair.name
    print p.prev_pair
When I run this test, I get the following result.
0
0
10
None
10108
0
What I don't understand is why the second call to p.next_pair produces a different result (10108) than the first call (0). install does not return a Pair object that can overwrite the one passed in (it returns None), and it's not as though I'm passing install a pointer.
My understanding of call-by-value is that the interpreter copies the values passed into a function, leaving the caller's variables unchanged. For example, if I say
def foo(x):
    x = x+1
    return x
baz = 2
y = foo(baz)
print y
print baz
Then 3 and 2 should be printed, respectively. And indeed, when I test that out in the Python interpreter, that's what happens.
I'd really appreciate it if anyone can point me in the right direction here.
In Python, everything is an object. Simple assignment stores a reference to the assigned object in the assigned-to name. As a result, it is more straightforward to think of Python variables as names that are assigned to objects, rather than objects that are stored in named locations.
For example:
baz = 2
... stores in baz a pointer, or reference, to the integer object 2 which is stored elsewhere. (Since the type int is immutable, Python actually has a pool of small integers and reuses the same 2 object everywhere, but this is an implementation detail that need not concern us much.)
When you call foo(baz), foo()'s local variable x also points to the integer object 2 at first. That is, the foo()-local name x and the global name baz are names for the same object, 2. Then x = x + 1 is executed. This changes x to point to a different object: 3.
It is important to understand: x is not a box that holds 2, and 2 is then incremented to 3. No, x initially points to 2 and that pointer is then changed to point to 3. Naturally, since we did not change what object baz points to, it still points to 2.
Another way to explain it is that in Python, all argument passing is by value, but all values are references to objects.
A counter-intuitive result of this is that if an object is mutable, it can be modified through any reference and all references will "see" the change. For example, consider this:
baz = [1, 2, 3]
def foo(x):
    x[0] = x[0] + 1
foo(baz)
print baz
>>> [2, 2, 3]
This seems very different from our first example. But in reality, the argument is passed the same way. foo() receives a pointer to baz under the name x and then performs an operation on it that changes it (in this case, the first element of the list is pointed to a different int object). The difference is that the name x is never pointed to a new object; it is x[0] that is modified to point to a different object. x itself still points to the same object as baz. (In fact, under the hood the assignment to x[0] becomes a method call: x.__setitem__().) Therefore baz "sees" the modification to the list. How could it not?
You don't see this behavior with integers and strings because you can't change integers or strings; they are immutable types, and when you modify them (e.g. x = x + 1) you are not actually modifying them but binding your variable name to a completely different object. If you change baz to a tuple, e.g. baz = (1, 2, 3), you will find that foo() gives you an error because you can't assign to elements of a tuple; tuples are another immutable type. "Changing" a tuple requires creating a new one, and assignment then points the variable to the new object.
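A short sketch of the tuple case, written with Python 3 syntax. The try/except and the variable error are only there so the snippet runs to completion; the last two lines show one way to "change" a tuple by building a new one.

```python
def foo(x):
    x[0] = x[0] + 1


baz = (1, 2, 3)
try:
    foo(baz)
    error = None
except TypeError as exc:
    # Tuples do not support item assignment.
    error = exc

print(type(error).__name__)  # TypeError
print(baz)                   # (1, 2, 3)

# "Changing" the tuple means building a new one and rebinding the name.
baz = (baz[0] + 1,) + baz[1:]
print(baz)                   # (2, 2, 3)
```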
Objects of classes you define are mutable and so your Pair instance can be modified by any function it is passed into -- that is, attributes may be added, deleted, or reassigned to other objects. None of these things will re-bind any of the names pointing to your object, so all the names that currently point to it will "see" the changes.
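Here is a stripped-down sketch of the same effect. Node and link are hypothetical stand-ins for the Pair class and the install function, reduced to the one attribute that matters, and written with Python 3 syntax.

```python
class Node:
    """Hypothetical stand-in for Pair with only the fields needed here."""

    def __init__(self, name):
        self.name = name
        self.next_node = None


def link(first, second):
    # Attribute assignment mutates the object that `first` names.
    # Nothing is returned, yet the caller's object has changed.
    first.next_node = second


a = Node("a")
b = Node("b")
print(a.next_node)       # None

result = link(a, b)
print(result)            # None
print(a.next_node.name)  # b
```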
Python does not copy anything when passing variables to a function. It is neither call-by-value nor call-by-reference, but of those two it is more similar to call-by-reference. You could think of it as "call-by-value, but the value is a reference".
If you pass a mutable object to a function, then modifying that object inside the function will affect the object everywhere it appears. (If you pass an immutable object to a function, like a string or an integer, then by definition you can't modify the object at all.)
The reason this isn't technically pass-by-reference is that you can rebind a name so that the name refers to something else entirely. (For names of immutable objects, this is the only thing you can do to them.) Rebinding a name that exists only inside a function doesn't affect any names that might exist outside the function.
In your first example with the Pair objects, you are modifying an object, so you see the effects outside of the function.
In your second example, you are not modifying any objects, you are just rebinding names to other objects (other integers in this case). baz is a name that points to an integer object (in Python, everything is an object, even integers) with a value of 2. When you pass baz to foo(x), the name x is created locally inside the foo function on the stack, and x is set to the pointer that was passed into the function -- the same pointer as baz. But x and baz are not the same thing, they only contain pointers to the same object. On the x = x+1 line, x is rebound to point to an integer object with a value of 3, and that pointer is what is returned from the function and used to bind the integer object to y.
If you rewrote your first example to explicitly create a new Pair object inside your function based on the information from the Pair object passed into it (whether this is a copy you then modify, or if you make a constructor that modifies the data on construction) then your function would not have the side-effect of modifying the object that was passed in.
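A rough sketch of that idea using copy.copy from the standard library. The Pair class below is a reduced version of the one in the question, and with_score is a hypothetical helper, not something from the original code.

```python
import copy


class Pair:
    """Reduced version of the Pair class from the question."""

    def __init__(self, name, prefs, score):
        self.name = name
        self.score = score
        self.preferences = prefs


def with_score(pair, new_score):
    # Work on a shallow copy so the Pair passed in is left alone.
    clone = copy.copy(pair)
    clone.score = new_score
    return clone


p = Pair(20000, [3, 1, 2, 50], 45)
q = with_score(p, 50)
print(p.score, q.score)                # 45 50
print(q is p)                          # False
print(q.preferences is p.preferences)  # True
```

The last line is a reminder that a shallow copy still shares mutable attributes such as the preferences list; copy.deepcopy would be needed to separate those as well.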
Edit: By the way, in Python you shouldn't use 0 as a placeholder to mean "I don't have a value" -- use None. And likewise you shouldn't use 0 to mean False, like you seem to be doing in flag. But all of 0, None and False evaluate to False in boolean expressions, so no matter which of those you use, you can say things like if not flag instead of if flag == 0.
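A small sketch of what that looks like in practice. Link and append_link are hypothetical names; the point is only the use of None for "no neighbour" and a real boolean for the flag.

```python
class Link:
    """Hypothetical minimal link; None marks a missing neighbour."""

    def __init__(self, name):
        self.name = name
        self.next_link = None
        self.prev_link = None


def append_link(head, new_link):
    # Walk to the end of the chain, testing against None rather than 0.
    node = head
    while node.next_link is not None:
        node = node.next_link
    node.next_link = new_link
    new_link.prev_link = node


head = Link("first")
append_link(head, Link("second"))

inserted = False  # a real boolean instead of flag = 0
if not inserted:
    print(head.next_link.name)   # second
print(head.prev_link is None)    # True
```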
I suggest that you forget about implementing a linked list, and simply use an instance of a Python list. If you need something other than the default Python list, maybe you can use something from a Python module such as collections.
A Python loop to follow the links in a linked list will run at Python interpreter speed, which is to say, slowly. If you simply use the built-in list class, your list operations will happen in Python's C code, and you will gain speed.
If you need something like a list but with fast insertion and fast deletion, can you make a dict work? If there is some sort of ID value (string or integer or whatever) that can be used to impose an ordering on your values, you could just use that as a key value and gain lightning fast insert and delete of values. Then if you need to extract values in order, you can use the dict.keys() method function to get a list of key values and use that.
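A minimal sketch of the dict idea, assuming each score is unique so it can serve as the key. The keys and values are placeholder data for illustration only.

```python
# Placeholder data: score -> label. Assumes scores are unique.
by_score = {}
by_score[45] = "a"
by_score[60] = "b"
by_score[30] = "c"

# Deleting an entry does not require walking any links.
del by_score[30]

# Extract the values in descending score order when needed.
ordered = [by_score[score] for score in sorted(by_score, reverse=True)]
print(ordered)  # ['b', 'a']
```

Sorting the keys on demand costs time on each extraction, so this trade-off suits cases with many inserts and deletes and relatively few ordered reads.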
But if you really need linked lists, I suggest you find code written and debugged by someone else, and adapt it to your needs. Google search for "python linked list recipe" or "python linked list module".
I'm going to throw in a slightly complicating factor:
>>> def foo(x):
...     x *= 2
...     return x
...
Define a slightly different function using a method I know is supported for numbers, lists, and strings.
First, call it with strings:
>>> baz = "hello"
>>> y = foo(baz)
>>> y
'hellohello'
>>> baz
'hello'
Next, call it with lists:
>>> baz=[1,2,2]
>>> y = foo(baz)
>>> y
[1, 2, 2, 1, 2, 2]
>>> baz
[1, 2, 2, 1, 2, 2]
>>>
With strings, the argument isn't modified. With lists, the argument is modified.
If it were me, I'd avoid modifying arguments within methods.
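One way to avoid the surprise is to build a new object instead of using the in-place operator. A minimal sketch, with doubled as a hypothetical replacement for foo:

```python
def doubled(x):
    # x * 2 builds a new object for lists, strings, and numbers alike,
    # so the argument is never modified.
    return x * 2


baz = [1, 2, 2]
y = doubled(baz)
print(y)    # [1, 2, 2, 1, 2, 2]
print(baz)  # [1, 2, 2]

word = "hello"
print(doubled(word), word)  # hellohello hello
```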