How to completely delete list items from memory

I need a way to completely delete list items from memory.
The point is that even if an object is in more than one list, when I delete it it will no longer be available in any list and it will be completely deleted from memory.
I tried del, but it doesn't work:
a = [Object(), Object()]
b = a[1]
del a[1]
but the object that was at a[1] is still in memory, and b is not None.
How do I delete it completely from memory?

In Python, the assignment operator binds the result of the right-hand side expression to the name on the left-hand side.
So when you run this:
a = [Object(), Object()]
b = a[1]
del a[1]
a is a name bound to a list, and b is a name bound to the object that was the list's second element. b = a[1] does not copy anything; it makes b refer to the same object that a[1] refers to. del a[1] only removes the list's reference to that object. b still refers to it, so the object stays in memory. Python frees an object only once nothing references it any longer, so you have to drop every reference to it (here, b as well).
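To watch this happen, here is a minimal sketch. Item is a hypothetical stand-in for the question's Object class, and weakref.ref is used only as a probe that does not keep the object alive. It assumes CPython, where an object is freed as soon as its last strong reference goes away; the gc.collect() call is there for other implementations.

```python
import gc
import weakref


class Item:
    """Hypothetical stand-in for the question's Object class."""


a = [Item(), Item()]
b = a[1]
probe = weakref.ref(a[1])  # weak reference: observes the object without owning it

del a[1]  # removes only the list's reference
still_alive_after_del = probe() is not None  # b still holds a strong reference

del b  # drop the last strong reference
gc.collect()  # not needed on CPython for this case, harmless elsewhere
freed = probe() is None

print(len(a), still_alive_after_del, freed)
```

So the object disappears only after both the list slot and b have let go of it.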

Related

What happens in memory while unpacking a collection?

Say
list1=[4,8,12]
a,b,c=list1
The result is a=4, b=8, c=12.
My confusion
The instructor told us that it is not as if a gets mapped to 4, b to 8, and c to 12. I didn't clearly understand what he said (although I listened to him several times). He said something like: an object is created for 4, and a is mapped to that object. But what is the difference between this and what I have drawn in my figure?

The thing about your picture that's misleading is that it implies that a, b, and c reference slices of list1. If you change list1, though, you will find that a, b, and c aren't affected by that change.
A better way to draw the picture might be to show 4, 8, and 12 separate from list1:
list1-->[ ][ ][ ]
         |  |  |
         V  V  V
         4  8  12
         ^  ^  ^
         |  |  |
         a  b  c
All of the variables are independent of one another, even though some of them (e.g. list1[0] and a) currently point to the same values.
To put it another way: saying a = list1[0] is saying "evaluate list1[0] and assign a to reference whatever that value is right now", which is not the same as saying "make a be an alias for list1[0]".
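A minimal sketch of that difference, using the values from the question (the name same_object_before is only for illustration):

```python
list1 = [4, 8, 12]
a, b, c = list1

# Right after unpacking, a and list1[0] reference the very same int object.
same_object_before = a is list1[0]

# Rebinding the list slot does not touch the name a.
list1[0] = 56

print(same_object_before, a, list1)  # True 4 [56, 8, 12]
```

a keeps referring to 4 because it was bound to the object, not to the slot list1[0].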

Try this:
# define the list
list1 = [4, 8, 12]
# unpacking binds a, b and c to the three int objects the list already holds;
# nothing is copied and no new memory is reserved for the values
# (on 64-bit CPython 3 a small int such as 4 typically takes
# sys.getsizeof(4) == 28 bytes, but it exists only once)
a, b, c = list1
print(a, b, c)
# rebind the first slot of the list to a different object
list1[0] = 56
print(list1)
# now check that "a" was never an alias for "list1[0]": it is still 4
print(a)
But you must be careful with this kind of assignment when the value is the list itself. Try this as well:
list2 = list1
print(list1, list2)
# then change either of them
list1[0] = -1
# "list2" refers to the same list object as "list1", so it shows the change too
print(list1, list2)
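If you want an independent list rather than a second name for the same one, make a copy. A small sketch (alias and snapshot are names chosen just for this example):

```python
list1 = [4, 8, 12]
alias = list1             # second name for the same list object
snapshot = list1.copy()   # new list holding the same elements (shallow copy)

list1[0] = -1

print(alias is list1, snapshot is list1)  # True False
print(alias, snapshot)                    # [-1, 8, 12] [4, 8, 12]
```

Note that copy() is shallow: the new list is a separate object, but its elements are still the same objects as in the original.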

how del list[:] works [duplicate]

list[:] creates a copy of the list, so why does del list[:] remove all the items of the list itself?
Shouldn't it delete the copy of the list?

The answer is no, it shouldn't. It's intended to delete all of the elements of the list. Looking at the documentation, the result of s.clear() (where s is a mutable sequence type, for example a list) is:
removes all items from s (same as del s[:])
Hence, del s[:] is the same as s.clear() in that it removes all items from s.
Perhaps, this is a bit more understandable if you consider that the function called behind the scenes is __delitem__. From the docs:
Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.
Consider the following difference:
a = [1,2,3]
b = a
del a
# print(a) ## raises an error
print(b) ## prints [1,2,3]
c = [1,2,3]
d = c
del c[:]
print(c) ## prints []
print(d) ## prints []
So why would you want del a[:] to behave this way? Well, think of it as just a special case of deleting a slice of a list. For example, say that you'd want to delete the elements at indices 3, 4, and 5 of a long list a = list(range(40)). With the slice notation and __delitem__ this is easy, just use del a[3:6]. Now try to do the same with a for loop and you'll soon find out it can get quite cumbersome. Heck, just try to delete all the items of a (but not a itself!) with a for loop ;)
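For comparison, a short sketch of the three ways to remove indices 3 to 5: the slice form, a loop, and the explicit __delitem__ call that the slice form amounts to.

```python
a = list(range(40))
del a[3:6]                  # one statement removes indices 3, 4 and 5

b = list(range(40))
for _ in range(3):          # loop version: delete index 3 three times,
    del b[3]                # because the remaining items shift left each time

c = list(range(40))
c.__delitem__(slice(3, 6))  # what del c[3:6] calls behind the scenes

print(a[:6], a == b == c)   # [0, 1, 2, 6, 7, 8] True
```

The loop has to account for the items shifting after every deletion, which is exactly the bookkeeping the slice form spares you.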

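In an expression, a[:] is simply a shallow copy of a: a new list object that holds the very same elements as a. The elements are identical, but the two lists are not. In a del statement, however, a[:] is a deletion target rather than an expression, so no copy is made and the deletion acts on a itself.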
Let's create a list and do some id checks:
a = [1, 2, 3]
print(id(a)) # 97731118088
print(id(a[:])) # 97731213576
print(id(a[:])) # 97731212104
print(id(a[:])) # 97731198600
Notice the changing ids of the copy a[:]. A new object is created on the fly every time a[:] is evaluated, and most importantly, even the expression a[:] is a[:] evaluates to False! Let's look at the ids of their elements to come to a conclusion:
print(id(a[0])) # 1648192992
print(id(a[:][0])) # 1648192992
a[0] is a[:][0] # True
a[1] is a[:][1] # True
a[2] is a[:][2] # True
So, we can conclude that a[:] is an object that consists of the very elements of a but is a different object than a, and is a different object every time it is called. The elements of a and a[:] are all identical, but they themselves are not.
So what del a[:] does is to remove all of the elements of a. That way a is mutated, and you end up with an empty a, i.e. []. However, del a removes the name a from namespace completely, and when you ask Python to print a for you, you'll get a NameError: name 'a' is not defined.
But how do we know that? Well, let's gain some perspective by disassembling del on a vs a[:]:
Let's define two functions:
def deletion1(a):
    del a
def deletion2(a):
    del a[:]
Let's disassemble them:
import dis
dis.dis(x = deletion1)
  1           0 DELETE_FAST              0 (a)
              2 LOAD_CONST               0 (None)
              4 RETURN_VALUE
dis.dis(x = deletion2)
  1           0 LOAD_FAST                0 (a)
              2 LOAD_CONST               0 (None)
              4 LOAD_CONST               0 (None)
              6 BUILD_SLICE              2
              8 DELETE_SUBSCR
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
The dis documentation indicates that the DELETE_FAST operation, which the first function does, simply "Deletes local co_varnames[var_num]". This is basically removal of that name a so that you can't reach the list object anymore. Beware, this does not remove the object that is referred to by the name a, but just removes its name so that the a is not a reference to anything anymore. The object 97731118088 is still the same list, [1, 2, 3]:
import gc
for obj in gc.get_objects():
    if id(obj) == 97731118088:
        print(obj)
# [1, 2, 3]
On the other hand, again from the documentation, DELETE_SUBSCR "Implements del TOS1[TOS]". Here TOS is the slice object built by BUILD_SLICE and TOS1 is the list that a refers to: the instruction takes both off the stack and deletes that slice from the list in place. Nothing is copied. You are left with the name a, which now refers to a list whose elements have been removed, i.e. just an "empty shell", if you will. After this operation, a is [], but it still has the same id of 97731118088. Only its elements are gone, via an in-place deletion.
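You can also check this without reading bytecode. In the sketch below, Tracer is a hypothetical list subclass written only for this illustration; it records whatever key is passed to __delitem__.

```python
class Tracer(list):
    """Hypothetical list subclass that records the keys passed to __delitem__."""

    def __init__(self, *args):
        super().__init__(*args)
        self.deleted_keys = []

    def __delitem__(self, key):
        self.deleted_keys.append(key)
        super().__delitem__(key)


a = Tracer([1, 2, 3])
alias = a
id_before = id(a)

del a[:]  # calls a.__delitem__(slice(None, None, None)) on a itself

print(a.deleted_keys)                              # [slice(None, None, None)]
print(len(a), alias is a, id(a) == id_before)      # 0 True True
```

The list is emptied in place, and every other name bound to it sees the same empty list.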

Why does b+=(4,) work and b = b + (4,) doesn't work when b is a list?

If we take b = [1,2,3] and if we try doing: b+=(4,)
It returns b = [1,2,3,4], but if we try doing b = b + (4,) it doesn't work.
b = [1,2,3]
b+=(4,) # b is now [1,2,3,4]
b = b + (4,) # Gives an error saying you can't add tuples and lists
I expected b+=(4,) to fail as you can't add a list and a tuple, but it worked. So I tried b = b + (4,) expecting to get the same result, but it didn't work.

The problem with "why" questions is that they can usually mean several different things. I will try to answer each one I think you might have in mind.
"Why is it possible for it to work differently?" which is answered by e.g. this. Basically, += tries to use different methods of the object: __iadd__ (which is only checked on the left-hand side), vs __add__ and __radd__ ("reverse add", checked on the right-hand side if the left-hand side doesn't have __add__ or its __add__ returns NotImplemented) for +.
"What exactly does each version do?" In short, the list.__iadd__ method does the same thing as list.extend (but because of the language design, there is still an assignment back).
This also means for example that
>>> a = [1,2,3]
>>> b = a
>>> a += [4] # uses the .extend logic, so it is still the same object
>>> b # therefore a and b are still the same list, and b has the `4` added
[1, 2, 3, 4]
>>> b = b + [5] # makes a new list and assigns back to b
>>> a # so now a is a separate list and does not have the `5`
[1, 2, 3, 4]
+, of course, creates a new object, but explicitly requires another list instead of trying to pull elements out of a different sequence.
"Why is it useful for += to do this?" It's more efficient; the extend method doesn't have to create a new object. Of course, this has some surprising effects sometimes (like above), and generally Python is not really about efficiency, but these decisions were made a long time ago.
"What is the reason not to allow adding lists and tuples with +?" See here (thanks, @splash58); one idea is that (tuple + list) should produce the same type as (list + tuple), and it's not clear which type the result should be. += doesn't have this problem, because a += b obviously should not change the type of a.

They are not equivalent:
b += (4,)
is shorthand for:
b.extend((4,))
while + concatenates lists, so by:
b = b + (4,)
you're trying to concatenate a tuple to a list
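A quick check of both statements (a minimal sketch; the names alias and failure are only there to show what happened):

```python
b = [1, 2, 3]
alias = b

b += (4,)  # extends the existing list in place from any iterable
same_object = b is alias

failure = None
try:
    b = b + (4,)  # list + tuple is not supported
except TypeError as exc:
    failure = exc

print(b, same_object, type(failure).__name__)  # [1, 2, 3, 4] True TypeError
```

The += line succeeds and keeps the same list object, while the + line raises TypeError and leaves b unchanged.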

When you do this:
b += (4,)
it is converted to this:
b = b.__iadd__((4,))
Under the hood it calls b.extend((4,)). extend accepts any iterable, which is why this also works:
b = [1,2,3]
b += range(2) # b is now [1, 2, 3, 0, 1]
But when you do this:
b = b + (4,)
it is converted to this:
b = b.__add__((4,))
and list.__add__ accepts only another list.

From the official docs, for mutable sequence types both:
s += t
s.extend(t)
are defined as:
extends s with the contents of t
Which is different than being defined as:
s = s + t # not equivalent in Python!
This also means that t does not have to be a list: any iterable will work, including a tuple like in your example.
But it also works for ranges and generators! For instance, you can also do:
s += range(3)
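A short sketch with a few different iterables on the right-hand side (the values are arbitrary):

```python
s = [1, 2, 3]
s += range(3)                    # a range
s += (n * n for n in range(3))   # a generator expression
s.extend("ab")                   # a string is iterable too, one character at a time

print(s)  # [1, 2, 3, 0, 1, 2, 0, 1, 4, 'a', 'b']
```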

The "augmented" assignment operators like += were introduced in Python 2.0, which was released in October 2000. The design and rationale are described in PEP 203. One of the declared goals of these operators was the support of in-place operations. Writing
a = [1, 2, 3]
a += [4, 5, 6]
is supposed to update the list a in place. This matters if there are other references to the list a, e.g. when a was received as a function argument.
However, the operation can't always happen in place, since many Python types, including integers and strings, are immutable, so e.g. i += 1 for an integer i can't possibly operate in place.
In summary, augmented assignment operators were supposed to work in place when possible, and create a new object otherwise. To facilitate these design goals, the expression x += y was specified to behave as follows:
If x.__iadd__ is defined, x.__iadd__(y) is evaluated.
Otherwise, if x.__add__ is implemented, x.__add__(y) is evaluated.
Otherwise, if y.__radd__ is implemented, y.__radd__(x) is evaluated.
Otherwise raise an error.
The first result obtained by this process will be assigned back to x (unless that result is the NotImplemented singleton, in which case the lookup continues with the next step).
This process allows types that support in-place modification to implement __iadd__(). Types that don't support in-place modification don't need to add any new magic methods, since Python will automatically fall back to essentially x = x + y.
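The fallback can be seen with two small classes. This is only a sketch: AddOnly and InPlace are hypothetical names invented for the illustration, not anything from the standard library.

```python
class AddOnly:
    """Hypothetical type that defines __add__ but not __iadd__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Always builds and returns a new object.
        return AddOnly(self.items + list(other))


class InPlace(AddOnly):
    """Hypothetical type that additionally defines __iadd__."""

    def __iadd__(self, other):
        self.items.extend(other)
        return self  # the returned object is what gets assigned back


x = AddOnly([1])
x_before = x
x += (2,)  # no __iadd__, so this falls back to x = x.__add__((2,))

y = InPlace([1])
y_before = y
y += (2,)  # uses __iadd__, so the same object is updated

print(x is x_before, x_before.items, x.items)  # False [1] [1, 2]
print(y is y_before, y.items)                  # True [1, 2]
```

With AddOnly the name is rebound to a new object and the old one is untouched; with InPlace the existing object is modified and the name keeps pointing at it.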
So let's finally come to your actual question – why you can add a tuple to a list with an augmented assignment operator. From memory, the history of this was roughly like this: The list.__iadd__() method was implemented to simply call the already existing list.extend() method in Python 2.0. When iterators were introduced in Python 2.1, the list.extend() method was updated to accept arbitrary iterators. The end result of these changes was that my_list += my_tuple worked starting from Python 2.1. The list.__add__() method, however, was never supposed to support arbitrary iterators as the right-hand argument – this was considered inappropriate for a strongly typed language.
I personally think the implementation of augmented operators ended up being a bit too complex in Python. It has many surprising side effects, e.g. this code:
t = ([42], [43])
t[0] += [44]
The second line raises TypeError: 'tuple' object does not support item assignment, but the operation is successfully performed anyway – t will be ([42, 44], [43]) after executing the line that raises the error.
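This is easy to reproduce. The sketch below catches the exception so that the tuple can be inspected afterwards (the name error is just for illustration):

```python
t = ([42], [43])
error = None
try:
    t[0] += [44]  # the list is extended first, then the assignment to t[0] fails
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
print(t)                     # ([42, 44], [43])
```

The in-place extend on the inner list happens before Python tries to store the result back into the tuple, which is the step that raises.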

Most people would expect X += Y to be equivalent to X = X + Y. Indeed, the Python Pocket Reference (4th ed) by Mark Lutz says on page 57 "The following two formats are roughly equivalent: X = X + Y , X += Y". However, the people who specified Python did not make them equivalent. Possibly that was a mistake which will result in hours of debugging time by frustrated programmers for as long as Python remains in use, but it's now just the way Python is. If X is a mutable sequence type, X += Y is equivalent to X.extend( Y ) and not to X = X + Y.

As explained here, if list didn't implement the __iadd__ method, b+=(4,) would just be shorthand for b = b + (4,). Obviously it is not, so list does implement __iadd__. Apparently the implementation of the __iadd__ method is something like this:
def __iadd__(self, x):
    self.extend(x)
    return self
We know that the above code is not the actual implementation of __iadd__, but we can assume that it does something like the extend method, which accepts tuple inputs.

Python Append vs list+list

I read Python list + list vs. list.append(), which is a similar question, but my question is more in relation to the code below
a = [[]] * 4
b = [[]] * 4
a[3] = a[3] + [1]
b[3].append(1)
print(a, b)
Which gives:
[[], [], [], [1]] [[1], [1], [1], [1]]
Why would these 2 be any different? I've never run into an example like this where these 2 methods have different outputs...
Thanks

a[3] = a[3] + [1] is not modifying a[3]. Instead, it is putting a new item there. a[3] + [1] creates a list that is just like a[3] except that it has an extra one at the end. Then, a[3] = ... sets a at the index 3 to that new list.
b[3].append(1) accesses b[3] and uses its .append() method. The .append() method works on the list itself and puts a one at the end of the list. [[]] * 4 does not create four separate inner lists; it creates an outer list that holds four references to one and the same inner list. So the change made by .append() shows up in all four items of b.
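A small sketch that makes the sharing visible, together with a list comprehension as one common way to get four independent inner lists (the names all_same and c are only for this example):

```python
b = [[]] * 4
all_same = all(row is b[0] for row in b)  # every slot holds the same list object
b[3].append(1)

c = [[] for _ in range(4)]  # a fresh inner list is created on each iteration
c[3].append(1)

print(all_same)  # True
print(b)         # [[1], [1], [1], [1]]
print(c)         # [[], [], [], [1]]
```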

Python Referenced For Loop

I'm playing with for loops in Python and trying to get used to the way they handle variables.
Take the following piece of code:
a=[1,2,3,4,5]
b=a
b[0]=6
After doing this, the zeroth element of both b and a should be 6. The = sign points a reference at the array, yes?
Now, I take a for loop:
a=[1,2,3,4,5]
for i in a:
    i=6
My expectation would be that every element of a is now 6, because I would imagine that i points to the elements in a rather than copying them; however, this doesn't seem to be the case.
Clarification would be appreciated, thanks!

Everything in Python is treated like a reference. What happens when you do b[0] = 6 is that you store a reference to 6 in the slot named by the left-hand side of the assignment, i.e. slot 0 of the list.
In the second example, each reference from the list is assigned to i in turn, so that i is 1, then 2, then 3, ... but i is never a slot of the list. So when you assign 6 to it, you only change what the name i refers to.
http://docs.python.org/reference/datamodel.html is an interesting read if you want to know more about the details.

That isn't how it works. The for loop is iterating through the values of a. The variable i actually has no sense of what is in a itself. Basically, what is happening:
# this is basically what the loop is doing:
# beginning of loop:
i = a[0]
i = 6
# next iteration of for loop:
i = a[1]
i = 6
# next iteration of for loop:
i = a[2]
i = 6
# you get the idea.
At no point does the value at the index change, the only thing to change is the value of i.
You're trying to do this:
for i in range(len(a)):  # xrange in Python 2
    a[i] = 6 # assign the value at index i
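Two other ways to change the list in place, as a minimal sketch (alias is only there to show that the list object stays the same):

```python
a = [1, 2, 3, 4, 5]
alias = a

for index, _ in enumerate(a):
    a[index] = 6          # assigns into the list slot, not to the loop variable

b = [1, 2, 3, 4, 5]
b[:] = [6] * len(b)       # slice assignment replaces the contents in place

print(a, alias is a)      # [6, 6, 6, 6, 6] True
print(b)                  # [6, 6, 6, 6, 6]
```

Both forms assign to positions inside the list, which is what rebinding the loop variable never does.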

Just as you said, "The = sign points a reference". So your loop just reassigns the 'i' reference to 5 different numbers, each one in turn.
