Okay a very silly question I'm sure. But how does python assign value to variables?
Say there is a variable a and is assigned the value a=2. So python assigns a memory location to the variable and a now points to the memory location that contains the value 2. Now, if I assign a variable b=a the variable b also points to the same location as variable a.
Now, if I assign a variable c=2, it still points to the same memory location as a instead of pointing to a new memory location. So, how does python work? Does it first check all the previously assigned variables to see if any of them share the same value, and then assign it that memory location?
Also, it doesn't work the same way with lists. If I assign a=[2,3] and then b=[2,3] and check their memory locations with the id function, I get two different memory locations. But c=b gives me the same location. Can someone explain the proper working and reason for this?
edit :-
Basically my question is because I've just started learning about the is operator, and apparently it holds True only if both names are pointing to the same location. So, if a=1000 and b=1000, a is b is False, but with a="world" and b="world" it holds True.
I've faced this problem before and understand that it gets confusing. There are two concepts here:
some data structures are mutable, while others are not
Python works off pointers... most of the time
So let's consider the case of a list (you accidentally stumbled on interning and peephole optimizations when you used ints - I'll get to that later)
So let's create two identical lists (remember lists are mutable)
In [42]: a = [1,2]
In [43]: b = [1,2]
In [44]: id(a) == id(b)
Out[44]: False
In [45]: a is b
Out[45]: False
See, despite the fact that the lists are identical, a and b are different memory locations. Now, this is because python computes [1,2], assigns it to a memory location, and then calls that location a (or b). It would take quite a long time for python to check every allocated memory location to see if [1,2] already exists, to assign b to the same memory location as a.
And that's not to mention that lists are mutable, i.e. you can do the following:
In [46]: a = [1,2]
In [47]: id(a)
Out[47]: 4421968008
In [48]: a.append(3)
In [49]: a
Out[49]: [1, 2, 3]
In [50]: id(a)
Out[50]: 4421968008
See that? The value that a holds has changed, but the memory location has not. Now, what if a bunch of other variable names were assigned to the same memory location? They would be changed as well, which would be a flaw in the language. To avoid that, python would have to copy the entire list into a new memory location, just because I wanted to change the value of a.
This is true even of empty lists:
In [51]: a = []
In [52]: b = []
In [53]: a is b
Out[53]: False
In [54]: id(a) == id(b)
Out[54]: False
Now, let's talk about that stuff I said about pointers:
Let's say you want two variables to actually talk about the same memory location. Then, you could assign your second variable to your first:
In [55]: a = [1,2,3,4]
In [56]: b = a
In [57]: id(a) == id(b)
Out[57]: True
In [58]: a is b
Out[58]: True
In [59]: a[0]
Out[59]: 1
In [60]: b[0]
Out[60]: 1
In [61]: a
Out[61]: [1, 2, 3, 4]
In [62]: b
Out[62]: [1, 2, 3, 4]
In [63]: a.append(5)
In [64]: a
Out[64]: [1, 2, 3, 4, 5]
In [65]: b
Out[65]: [1, 2, 3, 4, 5]
In [66]: a is b
Out[66]: True
In [67]: id(a) == id(b)
Out[67]: True
In [68]: b.append(6)
In [69]: a
Out[69]: [1, 2, 3, 4, 5, 6]
In [70]: b
Out[70]: [1, 2, 3, 4, 5, 6]
In [71]: a is b
Out[71]: True
In [72]: id(a) == id(b)
Out[72]: True
Look what happened there! a and b are both assigned to the same memory location. Therefore, any changes you make to one, will be reflected on the other.
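If you want a second list that does not follow the first one around, you have to ask for a copy explicitly. Here is a minimal sketch of the difference between an alias and a copy; the variable names are just for illustration, and note that a shallow copy still shares any nested lists:

```python
import copy

a = [1, 2, 3, 4]
alias = a            # second name for the same list object
shallow = a.copy()   # new list object with the same elements (a[:] or list(a) also work)

a.append(5)

print(alias)    # the alias sees the append
print(shallow)  # the copy does not

nested = [[1, 2], [3, 4]]
shallow_nested = nested.copy()        # new outer list, inner lists are still shared
deep_nested = copy.deepcopy(nested)   # inner lists are copied as well

nested[0].append(99)

print(shallow_nested[0])  # shares the inner list, so it sees 99
print(deep_nested[0])     # fully independent
```

So b = a never copies anything; a.copy() copies one level, and copy.deepcopy copies everything underneath.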
Lastly, let's talk briefly about that interning stuff I mentioned before. Python tries to save space. So, it loads a few small things into memory when it starts up (in CPython, the integers from -5 through 256, for example). As a result, when you assign a variable to a small integer (like 5), python doesn't have to create a new object for 5 before binding a variable name to it (unlike it did in the case of your lists). Since it already has 5 stashed away in some memory location, all it does is give that existing object another name. However, for much larger integers, this is no longer the case:
In [73]: a = 5
In [74]: b = 5
In [75]: id(a) == id(b)
Out[75]: True
In [76]: a is b
Out[76]: True
In [77]: a = 1000000000
In [78]: b = 1000000000
In [79]: id(a) == id(b)
Out[79]: False
In [80]: a is b
Out[80]: False
This is an optimization that python performs for small integers. In general, you can't count on a and c pointing to the same location. If you try this experiment with progressively larger integers you'll see that it stops working at some point. In CPython the integers from -5 through 256 are handled this way, so 1000 is large enough to fall outside that range.
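A quick way to try this yourself is sketched below. It assumes CPython; the integers are built at runtime with int(...) so that the compiler cannot quietly merge two equal literals into one object, which can otherwise hide the effect inside a script:

```python
# Build the integers at runtime so equal literals cannot be merged by the compiler.
small_a = int("5")
small_b = int("5")
big_a = int("1000")
big_b = int("1000")

print(small_a is small_b)  # True on CPython: 5 comes from the small-integer cache
print(big_a is big_b)      # usually False on CPython: two separate int objects
print(big_a == big_b)      # equal values, whatever the identity check says
```

The takeaway is the same as above: use == to compare numbers, and treat whatever is reports for them as an implementation detail.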
Your understanding is generally correct, but it's worth noting that python lists are totally different animals compared to arrays in C or C++. From the documentation:
id(obj)
Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
The simple answer to your question is that python variables hold references to objects. Each time [2,3] is evaluated a new list object is created, so a and b refer to two different objects with different ids, whereas c=b only copies the reference, so c and b refer to the same object.
I am studying Wes McKinney's 'Python for data analysis'.
At some point he says:
"When assigning a variable (or name) in Python, you are creating a reference to the object on the righthand side of the equals sign. In practical terms, consider a list of integers:
In [8]: a = [1, 2, 3]
In [9]: b = a
In [11]: a.append(4)
In [12]: b
output will be:
Out[12]: [1, 2, 3, 4]
He reasons as such:
"In some languages, the assignment of b will cause the data [1, 2, 3] to be copied. In Python, a and b actually now refer to the same object, the original list"
My question is why the same thing does not occur in the case below:
In [8]: a = 5
In [9]: b = a
In [11]: a +=1
In [12]: b
Where I still get
Out[12]: 5
for b?
In the first case, you're creating a list and both a and b are pointing at this list. When you change the list, both variables still point at that same list, so both see its changes.
But when you increment a variable that points at an integer, 5 is still 5; you're not changing the integer. You're changing which object the variable a is pointing to. So a is now pointing at the value 6, while b is still pointing at 5. You're not changing the thing that a is pointing to, you're changing WHAT a is pointing to. b doesn't care about that.
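To see both behaviours side by side, here is a small sketch (variable names are only for illustration). For an int, += has to produce a different object and rebind the name; for a list, += modifies the existing object in place:

```python
# Immutable int: += rebinds the name to another object.
a = 5
b = a
id_before = id(a)
a += 1
print(a, b)                 # a moved on to 6, b still refers to 5
print(id(a) == id_before)   # False: a now refers to a different object

# Mutable list: += changes the object that both names refer to.
x = [1, 2, 3]
y = x
x += [4]
print(y)        # y sees the new element
print(x is y)   # True: still one list with two names
```

Note that x = x + [4] would behave like the int case instead, because + builds a new list and the assignment rebinds x.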
I have a question (Python version: 3.9.7).
I ran the code below, but I cannot understand what is happening.
Please let me know why it behaves as shown below.
(As far as I know, number is immutable, so when something new as a number is assigned, the object address should point out different address including the number.)
import numpy as np

a = np.array([[0,1,2],[3,4,5],[6,7,8]])
id(a[0]) #1977043162384
A = [0,0,0]; a[0] = A
id(a[0]) #1977043162384 (I cannot understand this part)
b = [1,2,3]
a[0] = b
id(a[0]) #1977290465808
The ID returned by the second id(a[0]) call should have changed, shouldn't it?
Every time you index an array like that, NumPy wraps the underlying data in a new array object, so:
In [3]: a = np.array([[0,1,2],[3,4,5],[6,7,8]])
In [4]: a[0]
Out[4]: array([0, 1, 2])
In [5]: a[0] is a[0]
Out[5]: False
Or perhaps visualized another way:
In [6]: id(a[0])
Out[6]: 140266652673680
In [7]: id(a[0])
Out[7]: 140268012281648
In [8]: id(a[0])
Out[8]: 140267734662960
In [9]: id(a[0])
Out[9]: 140266652673680
You shouldn't expect id(a[0]) to be different. It's free to re-use the same id because the lifetimes of those objects are not overlapping.
Of course, whether it re-uses that ID is an implementation detail. But why did you expect the ID to change? It is important to understand,
A = [0,0,0]; a[0] = A
does not in any way put that list in the array. Instead, the values are copied into the array's primitive, underlying buffer.
"(As far as I know, number is immutable, so when something new as a number is assigned, the object address should point out different address including the number)"
There are no python objects in your array; it stores raw fixed-width integers (a NumPy integer dtype such as numpy.int64). This is crucial to understand.
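Here is a short sketch that makes both points visible, assuming NumPy is installed. It uses np.shares_memory to show that every a[0] is a fresh wrapper around the same buffer, and then shows that assigning a list only copies its values:

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# Each indexing operation builds a new ndarray object (a view) over the same data.
row_1 = a[0]
row_2 = a[0]
print(row_1 is row_2)                   # False: two distinct wrapper objects
print(np.shares_memory(row_1, row_2))   # True: both look at the same buffer

# Assigning a list copies its values into the buffer; the list itself is not stored.
A = [0, 0, 0]
a[0] = A
A[0] = 99        # changing the list afterwards does not touch the array
print(a[0, 0])   # still 0

# Writing through a view does change the array, because the memory is shared.
row_1[1] = 42
print(a[0, 1])   # 42
```

So the id of a[0] tells you about a short-lived wrapper object, not about the numbers stored in the array.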
I am confused by the mutable variable in python. See the following example code:
In [144]: a=[1,2,3,4]
In [145]: b=a
In [146]: a.append(5)
In [147]: a
Out[147]: [1, 2, 3, 4, 5]
In [148]: b
Out[148]: [1, 2, 3, 4, 5]
Since list is mutable, when using append function, it works on the same memory. This is understandable to me. But the following code confuses me.
In [149]: import numpy as np
In [150]: a=np.random.randn(3,2)
In [151]: b=a
In [152]: a=a-1
In [153]: a
Out[153]:
array([[-2.05342905, -1.21441195],
[-1.29901352, -3.29416381],
[-2.28775209, -1.65702149]])
In [154]: b
Out[154]:
array([[-1.05342905, -0.21441195],
[-0.29901352, -2.29416381],
[-1.28775209, -0.65702149]])
Since the NumPy array is also mutable, why wasn't the change made in the same memory that a refers to when I ran a=a-1? Instead, a now refers to new memory with new values.
Why didn't a behave as in the first example, where a still refers to the same memory after appending the new value 5 to the list?
Because when you call a=a-1 you are assigning a new value to the variable a. The expression a-1 doesn't change a in place; it creates another object, and a is then rebound to that new object.
>>> a = [1, 2, 3]
>>> b = a
>>> a = a + [4]
>>> a
[1, 2, 3, 4]
>>> b
[1, 2, 3]
See? It has nothing to do with numpy specifically...
Short answer: a -= 1 works in place, but a = a-1 computes the result in a new array and then rebinds a to it. Therefore a no longer points to the original location, which b still refers to unchanged.
You can check this using is. The is keyword tells you whether two variables refer to the same object in memory; it is different from ==, which compares values.
>>> import numpy as np
>>> a = np.random.randn(3,2)
>>> b = a
>>> a is b
True
>>> a, b
(array([[-0.14563848, 2.11951025],
[ 0.50913228, -0.61049821],
[ 2.29055958, -0.83795141]]), array([[-0.14563848, 2.11951025],
[ 0.50913228, -0.61049821],
[ 2.29055958, -0.83795141]]))
>>> a -= 1
>>> a
array([[-1.14563848, 1.11951025],
[-0.49086772, -1.61049821],
[ 1.29055958, -1.83795141]])
>>> b
array([[-1.14563848, 1.11951025],
[-0.49086772, -1.61049821],
[ 1.29055958, -1.83795141]])
>>> a is b
True
>>> a = a - 1
>>> a
array([[-2.14563848, 0.11951025],
[-1.49086772, -2.61049821],
[ 0.29055958, -2.83795141]])
>>> b
array([[-1.14563848, 1.11951025],
[-0.49086772, -1.61049821],
[ 1.29055958, -1.83795141]])
>>> a is b
False
a=b=[1,2,3]
print (a is b) #True
But
a=[1,2,3]
print (a is [1,2,3]) #False
Why does the second part print False ?
Multiple assignment in Python creates two names that point to the same object. For example,
>>> a=b=[1,2,3]
>>> a[0] = 10
>>> b
[10, 2, 3]
is can be used to check whether two names (a and b) hold the reference to the same memory location (object). Therefore,
a=b=[1,2,3] # a and b hold the same reference
print (a is b) # True
Now in this example,
a = [1,2,3]
print (a is [1,2,3]) # False
Here a and the literal [1, 2, 3] do not refer to the same object: the literal creates a new list, even though both lists have identical elements.
In case you want to compare whether two lists contain the same elements, you can use ==:
>>> a=b=[1, 2, 3]
>>> a == b
True
>>>
>>> a = [1, 2, 3]
>>> a == [1, 2, 3]
True
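The same distinction applies to your own classes. Below is a small sketch using a hypothetical Point class (not from the question): == calls the class's __eq__ method and so compares values, while is only ever compares object identity and cannot be customised:

```python
class Point:
    """Hypothetical value class, used only to illustrate == versus is."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # == ends up here: compare by value, not by identity.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
q = Point(1, 2)   # equal value, separate object
r = p             # another name for the same object

print(p == q)   # True: same coordinates
print(p is q)   # False: two different objects
print(p is r)   # True: one object, two names
```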
Your first one explicitly makes a and b references to the object created by the list display [1,2,3].
In your second code, both uses of the list display [1,2,3] necessarily create new list objects, because lists are mutable and you don't want to implicitly share references to them.
Consider a simpler example:
a = []
b = []
a.append(1)
Do you want b to be modified as well?
For immutable values, like ints, the language implementation may cause literals to reuse references to existing objects, but it's not something that can be relied on.
The problem is the operator you are using.
With is you are asking whether these are the identical object, not whether they are equal (same data).
a refers to one list object and the literal [1,2,3] creates another, so even though they are equal they are not the same object.
Why your results
When you set a and b to the same list, you are saying that a and b should reference the same object, so they are identical to each other. But a is not the object created by a second [1,2,3]; that is a different list that merely has the same contents.
In summary
== - equal to (same).
is - identical to.
So if you want to check if they are equal (same) use:
>>> a=[1,2,3]
>>> print (a == [1,2,3])
True
Similar question worth reading:
Is there a difference between "==" and "is"?
Hope this helps, Harry.
Two python objects have the same id but "is" operation returns false as shown below:
a = np.arange(12).reshape(2, -1)
c = a.reshape(12, 1)
print("id(c.data)", id(c.data))
print("id(a.data)", id(a.data))
print(c.data is a.data)
print(id(c.data) == id(a.data))
Here is the actual output:
id(c.data) 241233112
id(a.data) 241233112
False
True
My question is... why "c.data is a.data" returns false even though they point to the same ID, thus pointing to the same object? I thought that they point to the same object if they have same ID or am I wrong? Thank you!
a.data and c.data each produce a transient object with no other reference to it. As such, each one is freed as soon as it is no longer needed, and the same id can then be reused for the next one.
In your first comparison (c.data is a.data), the two objects have to co-exist while is checks if they are identical, which they are not.
In the second comparison (id(c.data) == id(a.data)), each object is released as soon as id returns its id.
If you save references to both objects, keeping them alive, you can see they are not the same object.
r0 = a.data
r1 = c.data
assert r0 is not r1
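The same effect can be reproduced without NumPy. The sketch below uses a hypothetical Holder class whose data property builds a new memoryview on every access, which is an assumption about how to mimic the behaviour, not a description of NumPy's internals. Once both results are kept alive, their ids are guaranteed to differ:

```python
class Holder:
    """Hypothetical stand-in for an array whose .data builds a new object per access."""

    def __init__(self):
        self._buf = bytearray(12)

    @property
    def data(self):
        # A fresh memoryview object is created on every attribute access.
        return memoryview(self._buf)


h = Holder()

# Keep both objects alive so their lifetimes overlap.
r0 = h.data
r1 = h.data

print(r0 is r1)            # False: two distinct memoryview objects
print(id(r0) == id(r1))    # False: live objects never share an id
print(r0.obj is r1.obj)    # True: both wrap the same underlying bytearray
```

Comparing id(h.data) == id(h.data) without keeping references may or may not report True, because the first object can be freed before the second one is created.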
In [62]: a = np.arange(12).reshape(2,-1)
...: c = a.reshape(12,1)
.data returns a memoryview object. id just gives the id of that object; it's not the value of the object, or any indication of where a databuffer is located.
In [63]: a.data
Out[63]: <memory at 0x7f672d1101f8>
In [64]: c.data
Out[64]: <memory at 0x7f672d1103a8>
In [65]: type(a.data)
Out[65]: memoryview
https://docs.python.org/3/library/stdtypes.html#memoryview
If you want to verify that a and c share a data buffer, I find the __array_interface__ to be a better tool.
In [66]: a.__array_interface__['data']
Out[66]: (50988640, False)
In [67]: c.__array_interface__['data']
Out[67]: (50988640, False)
It even shows the offset produced by slicing - here 24 bytes, 3*8
In [68]: c[3:].__array_interface__['data']
Out[68]: (50988664, False)
I haven't seen much use of a.data. It can be used as the buffer object when creating a new array with ndarray:
In [70]: d = np.ndarray((2,6), dtype=a.dtype, buffer=a.data)
In [71]: d
Out[71]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11]])
In [72]: d.__array_interface__['data']
Out[72]: (50988640, False)
But normally we create new arrays with shared memory with slicing or np.array(..., copy=False).
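For a plain yes/no answer on whether two arrays share a buffer, np.shares_memory is another option alongside __array_interface__. A minimal sketch, assuming NumPy is installed and reusing the a and c from the question (d is an extra copy added here for contrast):

```python
import numpy as np

a = np.arange(12).reshape(2, -1)
c = a.reshape(12, 1)   # reshape of a contiguous array gives a view
d = a.copy()           # independent copy with its own buffer

print(np.shares_memory(a, c))   # True: same underlying data
print(np.shares_memory(a, d))   # False: separate buffers

# Writing through the view is visible in a, but not in the copy.
c[0, 0] = 100
print(a[0, 0])   # 100
print(d[0, 0])   # 0
```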