assign references - python

Is there a way to assign references in Python?
For example, in PHP I can do this:
$a = 10;
$b = &$a;
$a = 20;
echo $a." ".$b; // 20, 20
How can I do the same thing in Python?

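In Python, assignment always binds a name to an object by reference. With mutable objects this gives shared state much like the PHP example: both names refer to the same object, so a change made through one name is visible through the other. That's why, when you run the following: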
>>> a = {'key' : 'value'}
>>> b = a
>>> b['key'] = 'new-value'
>>> print(a['key'])
you get 'new-value'.
Strictly speaking, if you do the following:
>>> a = 5
>>> b = a
>>> print(id(a) == id(b))
you'll get True.
But because immutable types such as int cannot be changed in place, you can't change the value of the object that b refers to. You can only create a new object with a new value, based on b. For example, if you do the following:
>>> print(id(b))
>>> b = b + 1
>>> print(id(b))
you'll get two different values.
This means that Python created a new object, computed its value based on b's value, and then bound the name b to this new object. This applies to all immutable types. Putting the two previous examples together:
>>> a = 5
>>> b = a
>>> print(id(a) == id(b))
True
>>> b += 1
>>> print(id(b) == id(a))
False
So, when you assign in Python, you always assign a reference. But some types cannot be changed in place, so when you "change" a value of such a type, you actually create a new object and rebind the name to it.
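If the goal is specifically to mimic the PHP snippet from the question, one common workaround is to share a mutable holder object instead of rebinding a name. The following is only a minimal sketch; the Ref class is a hypothetical helper invented for illustration, not something from the standard library.

```python
class Ref:
    """Hypothetical one-slot box; not a built-in or standard-library type."""

    def __init__(self, value):
        self.value = value


a = Ref(10)
b = a            # b and a now name the same Ref object
a.value = 20     # mutate the shared object instead of rebinding a
print(a.value, b.value)  # 20 20
```

Note that writing a = Ref(20) instead would rebind a and leave b pointing at the old box, so the sharing only works as long as you go through .value.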

In Python, everything is by default a reference. So when you do something like:
x=[1,2,3]
y=x
x[1]=-1
print(y)
It prints [1,-1,3].
The reason this does not work when you do
x=1
y=x
x=-1
print(y)
is that ints are immutable: they cannot be changed in place. Think about it, does a number really ever change? When you assign a new value to x, you rebind the name x to a different object rather than changing the old one. So y still points to the old one. Other immutable types (e.g. strings and tuples) behave in the same way.
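A short sketch may make the distinction clearer: what matters is whether you mutate the object or rebind the name. Rebinding x to a brand-new list leaves y untouched as well, even though lists are mutable. The variable names are the ones from the example above.

```python
x = [1, 2, 3]
y = x              # y and x refer to the same list

x[1] = -1          # mutation: visible through y
print(y)           # [1, -1, 3]

x = [7, 8, 9]      # rebinding: x now refers to a different list
print(y)           # [1, -1, 3]
print(x is y)      # False
```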

Related

How does python store things in lists vs integers?

If I do the following:
v = [0,0,0,0]
v2 = v
v[0]=5
print(v2)
the change to list v changes list v2.
a=5
b=a
a=6
print(b)
If I do this, on the other hand, changing a doesn't change b. What's the difference here? When I print id(a) and id(b) they give the same number, so shouldn't they be referencing the same object and change just like the list does?
Your original question is really asking about two different things.
The first is this, with some annotation - you ask about lists. Lists are objects that are mutable sequences, and in your code, v and v2 always refer to the same list. You can modify the contents of the list using any referrer to it and those changes are visible to anything that refers to it.
v = [0,0,0,0] # create a list object and assign reference to v
v2 = v # bind v2 to the same list object that v refers to
v[0]=5 # modify first list element
print(v2) # print the list v2 refers to, which is the same list v refers to
In the second piece of code you show, you're changing what a variable refers to, rather than changing the underlying value of an object.
a=5 # Assign a to be a reference to 5
b=a # Assign b to be a reference to the thing a refers to
a=6 # Re-assign a to refer to 6, a now refers to a different object than b
print(b) # b still refers to 5
And you pointed out the use of id. I will also introduce sys.getrefcount() which lets you see what the reference count for any particular object is, so we can see that, say, v's referred-to list has multiple things referring to it.
import sys
v = [0,0,0,0]
v2 = v
v[0]=5
print(f"id(v) = {id(v)}")
print(f"id(v2) = {id(v2)}")
# this shows 3 instead of 2 because getrefcount(x)
# increases refcount of x by 1
print(f"v referrers: {sys.getrefcount(v)}")
del v2 # remove v2 as a reference
# and this will show 2 because v2 doesn't
# exist/doesn't refer to it anymore
print(f"v referrers: {sys.getrefcount(v)}")
a = 5
b = 5
print(f"id(a) = {id(a)}")
print(f"id(b) = {id(b)}")
# you're reassigning a here, so the id will change
# but you didn't change what b refers to
a = 6
print(f"id(a) = {id(a)}")
print(f"id(b) = {id(b)}")
And the output of this would look something like this...
id(v) = 4366332480
id(v2) = 4366332480
v referrers: 3
v referrers: 2
id(a) = 4365582704
id(b) = 4365582704
id(a) = 4365582736
id(b) = 4365582704
Note that CPython treats integers in the range [-5, 256] specially: it keeps a preallocated cache of them in memory, so id on any integer in that range returns the same value for every reference to it. That is why a = 5 and b = 5 above report the same id even though b was never assigned from a.

Is the marked information really lost?

a = a[0] = [['Is this information lost?']]
print(a)
Is there any way to gain the string again?
If not, how is this memory-wise handled?
This example is a bit more illustrative of what's going on:
>>> b = [1, 2]
>>> print(id(b))
22918532837512
>>> a = a[0] = b
>>> print(a)
[[...], 2]
>>> print(id(a))
22918532837512
>>> print(id(a[0]))
22918532837512
>>> print(b)
[[...], 2]
>>> print(id(b))
22918532837512
It is important to understand here that = is not formally an operator in Python, but rather a delimiter, part of the syntax for assignment statements. In Python, unlike in C or C++, assignments are not expressions. Multi-assignment statements such as x = y = z are directly accommodated by assignment-statement syntax, not as a consequence of using a single assignment as an expression. A multi-assignment specifies that the value(s) being assigned should be assigned to each target (list), so that x = y = z is equivalent to
x, y = z, z
except that z is evaluated only once. Python assigns the targets from left to right, so that the above works about the same as
x = z
y = z
except, again, for the multiple evaluation of z.
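The single evaluation of the right-hand side and the left-to-right assignment order can be checked with a small sketch. The make_value function, the calls list, and the data list are hypothetical names used only for this illustration.

```python
calls = []


def make_value():
    """Record each call so we can count evaluations of the right-hand side."""
    calls.append("evaluated")
    return [1, 2]


x = y = make_value()
print(len(calls))  # 1
print(x is y)      # True

# Targets are assigned left to right: i becomes 1 first,
# so the subscript target data[i] is data[1], not data[0].
data = [0, 0]
i = 0
i = data[i] = 1
print(data)        # [0, 1]
```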
And with that, we can understand the original statement. This:
a = a[0] = [['Is this information lost?']]
works like
temp = [['Is this information lost?']]
a = temp
a[0] = temp
del temp
except that it does not involve a temporary name binding for the target list. Indeed, it should be clear that the previous is also equivalent to this, which is how I imagine most people would write it:
a = [['Is this information lost?']]
a[0] = a
Thus, to answer the original question, the string 'Is this information lost?' was never accessible other than via the list, so the assignment to a[0] leaves no name binding through which the string can be reached. The string is indeed lost at that point, in that sense. Python will continue to track it, however, until it is garbage collected.
As far as I can tell, a is a circular structure -- a list whose one and only element is itself.
>>> len(a)
1
>>> a
[[...]]
>>> len(a[0])
1
>>> a[0]
[[...]]
There is no longer any reference to the string, and hence no way to recover it from the Python interpreter. It should also therefore have been garbage-collected and no longer be in memory (although this is not guaranteed, and it may still be cached but inaccessible).
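If the string matters, one way to keep it reachable is to bind another name to the inner list before overwriting a[0]. A minimal sketch, where inner is a name introduced only for this example:

```python
inner = ['Is this information lost?']
a = [inner]
a[0] = a                 # a now contains only itself

print(a[0] is a)         # True
print(inner[0])          # Is this information lost?
```

As long as inner (or any other reference) exists, the inner list and its string stay alive regardless of what happens to a.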

In Python, why doesn't 'y = x; y += 1' also increment x?

First create a function for displaying reference count (note that we have to -1 each time to get the correct value, as the function itself INCREF-s the argument)
>>> from sys import getrefcount as rc
>>> x=1.1
>>> rc(x)-1
1
Now make another reference to the same PyObject:
>>> y=x
>>> rc(x)-1
2
>>> rc(y)-1
2
>>> x is y
True
Now perform an operation on the second handle, y:
>>> y+=1
This should be invoking PyNumber_InPlaceAdd on the PyObject that y points to.
So if this is true I would be expecting x to also read 2.1
>>> x,y
(1.1, 2.1)
>>> x is y
False
>>> rc(x)-1
1
>>> rc(y)-1
1
So my question is, what is Python doing internally to provide the right behaviour, rather than the behaviour I would expect from looking at PyNumber_InPlaceAdd?
(Note: I am using 1.1; if I used 1 the initial reference count would be >300, because 1 must be used all over the place behind-the-scenes in CPython, and it is clever enough to reuse objects.)
(This also begs the question: if I have foo = 20; bar = 19; bar += 1 does this mean it has to look through all its objects and check whether there already exists an object with this value, and if so reuse it? A simple test shows that the answer is no. Which is good news. It would be horribly slow once the program size gets big. So Python must just optimise for small integers.)
You don't need getrefcount for this, you can just use id:
>>> x = 1.1
>>> id(x)
50107888
>>> y = x
>>> id(y)
50107888 # same object
>>> y += 1
>>> id(y)
40186896 # different object
>>> id(x)
50107888 # no change there
float objects (along with e.g. str and int) are immutable in Python; they cannot be changed in place. The addition operation therefore creates a new object with the new value and binds it to the name y, effectively:
temp = y + 1
y = temp
In CPython, integers from -5 to 256 inclusive are "interned", i.e. stored for reuse, such that any operation with the result e.g. 1 will give a reference to the same object. This saves memory compared to creating new objects for these frequently-used values each time they're needed. You're right that it would be a pain to search all existing objects for a match every time a new object might be needed, so this is only done over a limited range. Using a contiguous range also means that the "search" is really just an offset in an array.
Now perform an operation on the second handle, y:
>>> y+=1
This should be invoking PyNumber_InPlaceAdd on the PyObject that y points to.
Up to here you are right.
But in-place adding of numbers returns a distinct object, not the old one.
The old one, as it is immutable, keeps its value.
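For contrast, here is a sketch of the same += pattern on an immutable float and on a mutable list. Lists implement in-place addition by modifying themselves, so both names still see the change; the names p and q are made up for this example.

```python
x = 1.1
y = x
y += 1                 # float is immutable: y is rebound to a new object
print(x, y)            # 1.1 2.1
print(x is y)          # False

p = [1]
q = p
q += [2]               # list += extends the existing list in place
print(p, q)            # [1, 2] [1, 2]
print(p is q)          # True
```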

Python object references

I'm aware that in python every identifier or variable name is a reference to the actual object.
a = "hello"
b = "hello"
When I compare the two strings
a == b
the output is
True
If I write equivalent code in Java, the output would be false, because the comparison is between the references (which are different) rather than the actual objects.
So what I see here is that the references (variable names) are replaced by the actual objects by the interpreter at run time.
So, is it safe for me to assume that "Every time the interpreter sees an already assigned variable name, it replaces it with the object it is referring to"? I googled it but couldn't find the answer I was looking for.
If you actually ran that in Java, I think you'd find it probably prints out true because of string interning, but that's somewhat irrelevant.
I'm not sure what you mean by "replaces it with the object it is referring to". What actually happens is that when you write a == b, Python calls a.__eq__(b), which is just like any other method call on a with b as an argument.
If you want an equivalent to Java-like ==, use the is operator: a is b. That compares whether the name a refers to the same object as b, regardless of whether they compare as equal.
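A sketch with a user-defined class shows the difference between the two operators. The Point class here is hypothetical and exists only for this illustration.

```python
class Point:
    """Hypothetical value class used only to illustrate == versus is."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = p1

print(p1 == p2)   # True
print(p1 is p2)   # False
print(p1 is p3)   # True
```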
Python interning:
>>> a = "hello"
>>> b = "hello"
>>> c = "world"
>>> id(a)
4299882336
>>> id(b)
4299882336
>>> id(c)
4299882384
In CPython, short identifier-like string literals tend to get interned automatically, which explains why a is b evaluates to True here.
To show that equal strings don't always have the same id
>>> a = "hello"+" world"
>>> b = "hello world"
>>> c = a
>>> a == b
True
>>> a is b
False
>>> b is c
False
>>> a is c
True
also:
>>> str([]) == str("[]")
True
>>> str([]) is str("[]")
False
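If you need equal strings to share one object, sys.intern from the standard library does this explicitly rather than relying on what the interpreter happens to intern. A minimal sketch; the strings are built at run time so that no compile-time sharing is involved, and a_interned and b_interned are names chosen for this example:

```python
import sys

a = "".join(["hello", " ", "world"])
b = "".join(["hello", " ", "world"])

print(a == b)                            # True

a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)          # True
```

For ordinary comparisons, == remains the right tool; interning is mainly an optimization.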

Python create new object with the same value [duplicate]

I come from the Java world, where I expect the following:
int a = valueassignedbyfunction();
int b = a;
a = a + 1;
After this, a is 1 greater than b. But in Python, b automatically gets incremented by one once the a = a + 1 operation is done, because b references the same object as a. How can I copy only the value of a and assign it to a new object called b?
Thanks!
Assuming integers, I cannot reproduce your issue:
>>> a = 1
>>> b = a
>>> a += 1
>>> a
2
>>> b
1
If we assume objects instead:
>>> class Test(object):
...     def __init__(self, v):
...         self.v = v
...
>>> a = Test(1)
>>> b = a.v
>>> a.v += 1
>>> print(a.v, b)
2 1
# No issues so far
# Let's bind b to the same object instead
>>> b = a
>>> a.v += 1
>>> print(a.v, b.v)
3 3
# Ah, there we go
# Using user252462's suggestion
>>> from copy import deepcopy
>>> b = deepcopy(a)
>>> a.v += 1
>>> print(a.v, b.v)
4 3
I think the main confusion here is the following: In Java, a line like
int i = 5;
allocates memory for an integer and associates the name i with this memory location. You can somehow identify the name i with this memory location and its type and call the whole thing "the integer variable i".
In Python, the line
i = 5
evaluates the expression on the right hand side, which will yield a Python object (in this case, the expression is really simple and will yield the integer object 5). The assignment statement makes the name i point to that object, but the relation between the name and the object is a completely different one than in Java. Names are always just references to objects, and there may be many names referencing the same object or no name at all.
This documentation might help out: http://docs.python.org/library/copy.html
You can use the copy library to deepcopy objects:
import copy
b = copy.deepcopy(a)
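The difference between a plain assignment, a shallow copy, and a deep copy can be sketched as follows. The dictionary contents and the names alias, shallow, and deep are made up for the example.

```python
import copy

original = {"name": "a", "values": [1, 2]}

alias = original                   # same object, no copy at all
shallow = copy.copy(original)      # new dict, but the inner list is shared
deep = copy.deepcopy(original)     # new dict and new inner list

original["name"] = "changed"
original["values"].append(3)

print(alias)    # {'name': 'changed', 'values': [1, 2, 3]}
print(shallow)  # {'name': 'a', 'values': [1, 2, 3]}
print(deep)     # {'name': 'a', 'values': [1, 2]}
```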
I'm not sure what you're seeing here.
>>> a = 1
>>> b = a
>>> a = a + 1
>>> b
1
>>> a
2
>>> a is b
False
Python integers are immutable; the + operation creates a new object with the value a+1. There are some weird reference issues with integers (http://distilledb.com/blog/archives/date/2009/06/18/python-gotcha-integer-equality.page), but you should get the same thing you expected in Java.
How about just doing
a = 1
b = a*1
