how del list[:] works [duplicate] - python

This question already has answers here:
Clarify how python del works for lists and slices
(2 answers)
Closed 3 years ago.
list[:] creates a copy of the list then why does del list[:] remove all the list items?
Shouldn't it delete the copy of the list?

The answer is no, it shouldn't. It's intended to delete all of the elements of the list. Looking at the documentation, the result of s.clear() (where s is a mutable sequence type, for example a list) is:
removes all items from s (same as del s[:])
Hence, del s[:] is the same as s.clear() in that it removes all items from s.
Perhaps, this is a bit more understandable if you consider that the function called behind the scenes is __delitem__. From the docs:
Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.
Consider the following difference:
a = [1,2,3]
b = a
del a
# print(a) ## raises an error
print(b) ## prints [1,2,3]
c = [1,2,3]
d = c
del c[:]
print(c) ## prints []
print(d) ## prints []
So why would you want del a[:] to behave this way? Well, think of it as just a special case of deleting a slice of a list. For example, say that you'd want to delete the 3rd, 4th, and 5th element of a long list a = list(range(40)). With the slice notation and the __delitem__ this is easy, just use del a[3:6]. Now try to do the same with a for loop and you'll soon find out it can get quite cumbersome. Heck, just try to delete all the items of a (but not the a itself!) with a for loop ;)

Because a[:] is is simply just a copy of a, consisting of the identical objects as a but a different object than a. Their elements are identical but they are not.
Let's create a list and do some id checks:
a = [1, 2, 3]
print(id(a)) # 97731118088
print(id(a[:])) # 97731213576
print(id(a[:])) # 97731212104
print(id(a[:])) # 97731198600
Notice the changing ids of the copy a[:]. It is an object that is created on the fly every time it is called, and most importantly, even a[:] is a[:] not True! Let's look at the ids of their elements to come to a conclusion:
print(id(a[0])) # 1648192992
print(id(a[:][0])) # 1648192992
a[0] is a[:][0] # True
a[1] is a[:][1] # True
a[2] is a[:][2] # True
So, we can conclude that a[:] is an object that consists of the very elements of a but is a different object than a, and is a different object every time it is called. The elements of a and a[:] are all identical, but they themselves are not.
So what del a[:] does is to remove all of the elements of a. That way a is mutated, and you end up with an empty a, i.e. []. However, del a removes the name a from namespace completely, and when you ask Python to print a for you, you'll get a NameError: name 'a' is not defined.
But how do we know that? Well, let's gain some perspective by disassembling del on a vs a[:]:
Let's define two functions:
def deletion1(a):
del a
def deletion2(a):
del a[:]
Let's disassemble them:
import dis
dis.dis(x = deletion1)
1 0 DELETE_FAST 0 (a)
2 LOAD_CONST 0 (None)
4 RETURN_VALUE
dis.dis(x = deletion2)
1 0 LOAD_FAST 0 (a)
2 LOAD_CONST 0 (None)
4 LOAD_CONST 0 (None)
6 BUILD_SLICE 2
8 DELETE_SUBSCR
10 LOAD_CONST 0 (None)
12 RETURN_VALUE
The dis documentation indicates that the DELETE_FAST operation, which the first function does, simply "Deletes local co_varnames[var_num]". This is basically removal of that name a so that you can't reach the list object anymore. Beware, this does not remove the object that is referred to by the name a, but just removes its name so that the a is not a reference to anything anymore. The object 97731118088 is still the same list, [1, 2, 3]:
import gc
for obj in gc.get_objects():
if id(obj) == 97731118088:
print(obj)
# [1, 2, 3]
On the other hand, again from the documentation, DELETE_SUBSCR "Implements del TOS1[TOS]", which is basically "an in-place operator that removes the top of the stack and pushes the result back on the stack". This way, the stack elements are removed, and you are left with the name a, which now refers to a list whose elements are deleted, i.e. just an "empty shell", if you will. After this operation, a becomes [], but has still the same id of 97731118088. Just its elements are gone via an in-place deletion.

Related

I'm trying to understand how reference works in python

After line 7, I haven't written a single line of code which mentions the list named 'outer'. However, if you execute it, you'll see that the 'outer' (i.e, the nested lists inside it) list would change/update due to lines 10 and 12...
I'm guessing it has something to do with reference vs value. My question is, why didn't line 13 effect (change/update) the 'outer' list the same way that lines 7 and 10 did? I'm trying to undertand this concept. How do I go about it. I know there's a lot of resources online.. but I don't even know what to google. Please help.
inner = []
outer = []
lis = ['a', 'b', 'c']
inner.append(lis[0])
outer.append(inner) <<---- Line 7 <<
print(outer)
inner.append(lis[1]) <<---- Line 10 <<
print(outer)
inner.append(lis[2]) <<---- Line 12 <<
print(outer)
lis[2] = 'x' <<---- Line *******13******* <<
print(outer)
This is a boiled-down version of your example:
some_list = []
a = 2
some_list.append(a)
a = 3
print(some_list) # output: [2]
print(a) # output: 3
If we follow your original logic, you would expect some_list to contain the value 3 when we print it. But the reality is that we never appended a itself to the list. Instead, writing some_list.append(a) means appending the value referenced by a to the list some_list.
Remember, variables are simply references to a value. Here's the same snippet as above, but with an explanation of what's referencing what.
some_list = [] # the name "some_list" REFERENCES an empty list
a = 2 # the name "a" REFERENCES the integer value 2
some_list.append(a) # we append the value REFERENCED BY "a"
# (the integer 2) to the list REFERENCED
# BY "some_list". That list is not empty
# anymore, holding the value [2]
a = 3 # the name "a" now REFERENCES the integer value 3. This
# has no implications on the list REFERENCED BY "some_list".
# We simply move the "arrow" that pointed the name "a" to
# the value 2 to its new value of 3
print(some_list) # output: [2]
print(a) # output: 3
The key aspect to understand here is that variables are simply references to a value. Writing some_list.append(a) does not mean "place the variable a into the list" but rather "place the value that the variable a references at this moment in time into the list". Variables cannot keep track of other variables, only the values that they are a reference to.
This becomes even clearer if we append to some_list a second time, after modifying the value that a references:
some_list = []
a = 2
some_list.append(a)
a = 3
some_list.append(a)
print(some_list) # output: [2, 3]
print(a) # output: 3
In Python, when you store a list in variable you don't store the list itself, but a reference to a list somewhere in the computer's RAM. If you say
a = [0, 1, 2]
b = a
c = 3
then both a and b will be references to the same list as you set b to a, which is a reference to a list. Then, modifying a will modify b and vice-versa. c, however, is an integer; it works differently. It's like that:
┌───┐
a │ █━┿━━━━━━━━━━━━━━━┓ ┌───┬───┬───┐
└───┘ ┠→│ 0 │ 1 │ 2 │
┌───┐ ┃ └───┴───┴───┘
b │ █━┿━━━━━━━━━━━━━━━┛
└───┘
┌───┐
c │ 3 │
└───┘
a and b are references to a same list, but c is an pointer to an integer which is copied (the integer) when you say d = c. The reference to it, however, is not copied.
So, let's go back to your program. When you say inner.append(lis[n]) you add the value at the end of the list inner. You don't add the reference to the item #2 of the list lis but you create a copy of the value itself and add to the list the reference to this copy!
If you modify lis, then it will have an impact only on variables that are references to lis.
If you want inner to be modified if you modify lis, then replace the inner.append(lis[n])s by inner.append(lis).

Why does b+=(4,) work and b = b + (4,) doesn't work when b is a list?

If we take b = [1,2,3] and if we try doing: b+=(4,)
It returns b = [1,2,3,4], but if we try doing b = b + (4,) it doesn't work.
b = [1,2,3]
b+=(4,) # Prints out b = [1,2,3,4]
b = b + (4,) # Gives an error saying you can't add tuples and lists
I expected b+=(4,) to fail as you can't add a list and a tuple, but it worked. So I tried b = b + (4,) expecting to get the same result, but it didn't work.
The problem with "why" questions is that usually they can mean multiple different things. I will try to answer each one I think you might have in mind.
"Why is it possible for it to work differently?" which is answered by e.g. this. Basically, += tries to use different methods of the object: __iadd__ (which is only checked on the left-hand side), vs __add__ and __radd__ ("reverse add", checked on the right-hand side if the left-hand side doesn't have __add__) for +.
"What exactly does each version do?" In short, the list.__iadd__ method does the same thing as list.extend (but because of the language design, there is still an assignment back).
This also means for example that
>>> a = [1,2,3]
>>> b = a
>>> a += [4] # uses the .extend logic, so it is still the same object
>>> b # therefore a and b are still the same list, and b has the `4` added
[1, 2, 3, 4]
>>> b = b + [5] # makes a new list and assigns back to b
>>> a # so now a is a separate list and does not have the `5`
[1, 2, 3, 4]
+, of course, creates a new object, but explicitly requires another list instead of trying to pull elements out of a different sequence.
"Why is it useful for += to do this? It's more efficient; the extend method doesn't have to create a new object. Of course, this has some surprising effects sometimes (like above), and generally Python is not really about efficiency, but these decisions were made a long time ago.
"What is the reason not to allow adding lists and tuples with +?" See here (thanks, #splash58); one idea is that (tuple + list) should produce the same type as (list + tuple), and it's not clear which type the result should be. += doesn't have this problem, because a += b obviously should not change the type of a.
They are not equivalent:
b += (4,)
is shorthand for:
b.extend((4,))
while + concatenates lists, so by:
b = b + (4,)
you're trying to concatenate a tuple to a list
When you do this:
b += (4,)
is converted to this:
b.__iadd__((4,))
Under the hood it calls b.extend((4,)), extend accepts an iterator and this why this also work:
b = [1,2,3]
b += range(2) # prints [1, 2, 3, 0, 1]
but when you do this:
b = b + (4,)
is converted to this:
b = b.__add__((4,))
accept only list object.
From the official docs, for mutable sequence types both:
s += t
s.extend(t)
are defined as:
extends s with the contents of t
Which is different than being defined as:
s = s + t # not equivalent in Python!
This also means any sequence type will work for t, including a tuple like in your example.
But it also works for ranges and generators! For instance, you can also do:
s += range(3)
The "augmented" assignment operators like += were introduced in Python 2.0, which was released in October 2000. The design and rationale are described in PEP 203. One of the declared goals of these operators was the support of in-place operations. Writing
a = [1, 2, 3]
a += [4, 5, 6]
is supposed to update the list a in place. This matters if there are other references to the list a, e.g. when a was received as a function argument.
However, the operation can't always happen in place, since many Python types, including integers and strings, are immutable, so e.g. i += 1 for an integer i can't possibly operate in place.
In summary, augmented assignment operators were supposed to work in place when possible, and create a new object otherwise. To facilitate these design goals, the expression x += y was specified to behave as follows:
If x.__iadd__ is defined, x.__iadd__(y) is evaluated.
Otherwise, if x.__add__ is implemented x.__add__(y) is evaluated.
Otherwise, if y.__radd__ is implemented y.__radd__(x) is evaluated.
Otherwise raise an error.
The first result obtained by this process will be assigned back to x (unless that result is the NotImplemented singleton, in which case the lookup continues with the next step).
This process allows types that support in-place modification to implement __iadd__(). Types that don't support in-place modification don't need to add any new magic methods, since Python will automatically fall back to essentially x = x + y.
So let's finally come to your actual question – why you can add a tuple to a list with an augmented assignment operator. From memory, the history of this was roughly like this: The list.__iadd__() method was implemented to simply call the already existing list.extend() method in Python 2.0. When iterators were introduced in Python 2.1, the list.extend() method was updated to accept arbitrary iterators. The end result of these changes was that my_list += my_tuple worked starting from Python 2.1. The list.__add__() method, however, was never supposed to support arbitrary iterators as the right-hand argument – this was considered inappropriate for a strongly typed language.
I personally think the implementation of augmented operators ended up being a bit too complex in Python. It has many surprising side effects, e.g. this code:
t = ([42], [43])
t[0] += [44]
The second line raises TypeError: 'tuple' object does not support item assignment, but the operation is successfully performed anyway – t will be ([42, 44], [43]) after executing the line that raises the error.
Most people would expect X += Y to be equivalent to X = X + Y. Indeed, the Python Pocket Reference (4th ed) by Mark Lutz says on page 57 "The following two formats are roughly equivalent: X = X + Y , X += Y". However, the people who specified Python did not make them equivalent. Possibly that was a mistake which will result in hours of debugging time by frustrated programmers for as long as Python remains in use, but it's now just the way Python is. If X is a mutable sequence type, X += Y is equivalent to X.extend( Y ) and not to X = X + Y.
As it's explained here, if array doesn't implement __iadd__ method, the b+=(4,) would be just a shorthanded of b = b + (4,) but obviously it's not, so array does implement __iadd__ method. Apparently the implementation of __iadd__ method is something like this:
def __iadd__(self, x):
self.extend(x)
However we know that the above code is not the actual implementation of __iadd__ method but we can assume and accept that there's something like extend method, which accepts tupple inputs.

How to completely delete list items from memory

I need a way to completely delete list items from memory.
The point is that even if an object is in more than one list, when I delete it it will no longer be available in any list and it will be completely deleted from memory.
I tried del, but it doesn't work:
a = [Object(), Object()]
b = a[1]
del a[1]
but the value of a [1] is still in memory and b != None
How do I delete it completely from memory?
In Python, assignment operator binds the result of the right-hand side expression to the name from the left-hand side expression.
So when you define this
a = [Object(), Object()]
b = a[1]
del a[1]
a is an object and the b is another object. When you are defining b = a[1] it gives the same reference since the values are the same. but when you are deleting the a[1] simply it disconnect the connection between the a[1] and the value. but still, b is connected to that value so it remains.
Link that visualizes evverything

Python Referenced For Loop

I'm playing with for loops in Python and trying to get used to the way they handle variables.
Take the following piece for code:
a=[1,2,3,4,5]
b=a
b[0]=6
After doing this, the zeroth element of both b and a should be 6. The = sign points a reference at the array, yes?
Now, I take a for loop:
a=[1,2,3,4,5]
for i in a:
i=6
My expectation would be that every element of a is now 6, because I would imagine that i points to the elements in a rather than copying them; however, this doesn't seem to be the case.
Clarification would be appreciated, thanks!
Everything in python is treated like a reference. What happens when you do b[0] = 6 is that you assign the 6 to an appropriate place defined by LHS of that expression.
In the second example, you assign the references from the array to i, so that i is 1, then 2, then 3, ... but i never is an element of the array. So when you assign 6 to it, you just change the thing i represents.
http://docs.python.org/reference/datamodel.html is an interesting read if you want to know more about the details.
That isn't how it works. The for loop is iterating through the values of a. The variable i actually has no sense of what is in a itself. Basically, what is happening:
# this is basically what the loop is doing:
# beginning of loop:
i = a[0]
i = 6
# next iteration of for loop:
i = a[1]
i = 6
# next iteration of for loop:
i = a[2]
i = 6
# you get the idea.
At no point does the value at the index change, the only thing to change is the value of i.
You're trying to do this:
for i in xrange(len(a)):
a[i] = 6 # assign the value at index i
Just as you said, "The = sign points a reference". So your loop just reassigns the 'i' reference to 5 different numbers, each one in turn.

Reason for unintuitive UnboundLocalError behaviour 2 [duplicate]

This question already has answers here:
Python difference between mutating and re-assigning a list ( _list = and _list[:] = )
(3 answers)
Closed 19 days ago.
Following up on Reason for unintuitive UnboundLocalError behaviour (I will assume you've read it).
Consider the following Python script:
def f():
# a+=1 # 1
aa=a
aa+=1
# b+='b' # 2
bb=b
bb+='b'
c[0]+='c' # 3
c.append('c')
cc=c
cc.append('c')
d['d']=5 # Update 1
d['dd']=6 # Update 1
dd=d # Update 1
dd['ddd']=7 # Update 1
e.add('e') # Update 2
ee=e # Update 2
ee.add('e') # Update 2
a=1
b='b'
c=['c']
d={'d':4} # Update 1
e=set(['e']) # Update 2
f()
print a
print b
print c
print d # Update 1
print e # Update 2
The result of the script is:
1
b
['cc', 'c', 'c']
{'dd': 6, 'd': 5, 'ddd': 7}
set(['e'])
The commented out lines (marked 1,2) are lines that would through an UnboundLocalError and the SO question I referenced explains why. However, the line marked 3 works!
By default, lists are copied by reference in Python, therefore it's understandable that c changes when cc changes. But why should Python allow c to change in the first place, if it didn't allow changes to a and b directly from the method's scope?
I don't see how the fact that by default lists are copied by reference in Python should make this design decision inconsistent.
What am I missing folks?
UPDATES:
For completeness I also added the dictionary equivalent to the question above, i.e. I added the source code and marked the update with # Update
For further completeness I also added the set equivalent. The set's behavior is actually surprisingly for me. I expected it to act similar to list and dictionary...
Unlike strings and integers, lists in Python are mutable objects. This means they are designed to be changed. The line
c[0] += 'c'
is identical to saying
c.__setitem__(0, c.__getitem__(0) + 'c')
which doesn't make any change to what the name c is bound to. Before and after this call, c is the same list – it's just the contents of this list that have changed.
Had you said
c += ['c']
c = [42]
in the function f(), the same UnboundLocalError would have occured, because the second line makes c a local name, and the first line translates to
c = c + ['c']
requiring the name c to be already bound to something, which (in this local scope) it isn't yet.
The important thing to think about is this: what object does a (or b or c) refer to? The line a += 1 is changing which integer a refers to. Integers are immutable, so when a changes from 1 to 2, it's really the same as a = a + 1, which is giving a an entirely new integer to refer to.
On the other hand, c[0] += 'c' doesn't change which list c refers to, it merely changes which string its first element refers to. Lists are mutable, so the same list can be modified without changing its identity.

Categories

Resources