I think this is one of the most common interview questions:
class A:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print self.name,
aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a
So what is the output of this code, and why?
The output is nothing, but why?
Is that because a is a reference to an object in the list, and when we use del we remove this reference but not the object?
There are at least 2 references to the object that a references (variables are references to objects, they are not the objects themselves). There's the one reference inside the list, and then there's the reference "a". When you del a, you remove one reference (the variable a) but not the reference from inside the list.
Also note that Python doesn't guarantee that __del__ will ever be called ...
__del__ gets called when an object is destroyed; this will happen after the last possible reference to the object is removed from the program's accessible memory. Depending on the implementation this might happen immediately or might be after some time.
Your code just removes the local name a from the execution scope; the object remains in the list so is still accessible. Try writing del aa[0], for example.
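The difference between del a and del aa[0] can be sketched in Python 3 as follows. This is a minimal illustration, assuming CPython's reference counting (where an object is finalized as soon as its last reference goes away); the log list is a hypothetical helper added here so the effect can be inspected instead of printed.

```python
log = []  # hypothetical helper: records which objects were finalized

class A:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        log.append(self.name)

aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a  # unbinds only the loop variable; the list still refers to each object

after_loop = list(log)      # nothing has been finalized yet

del aa[0]                   # removes the list's reference to the first object
after_item_del = list(log)  # in CPython the first object is finalized right away

print(after_loop, after_item_del)
```

On other implementations the second snapshot may still be empty, since finalization can be delayed.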
From the docs:
Note: del x doesn’t directly call x.__del__() - the former decrements the reference count for x by one, and the latter is only called when x's reference count reaches zero.
__del__ is triggered when the garbage collector finds an object to be destroyed. The garbage collector will try to destroy objects with a reference count of 0. del just decouples the label in the local namespace, thereby decrementing the reference count for the object in the interpreter. The behavior of the garbage collector is for the most part considered an implementation detail of the interpreter, so there's no guarantee that __del__ on objects will be called in any specific order, or even at all. That's why the behavior of this code is undefined.
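A small sketch of what del does to the reference count, assuming CPython (sys.getrefcount is CPython-specific in what it reports). The class name Thing and the variable names are made up for illustration. Only relative counts are compared, because the absolute numbers depend on the interpreter version.

```python
import sys

class Thing:
    pass

obj = Thing()
baseline = sys.getrefcount(obj)   # includes getrefcount's own temporary reference

alias = obj                       # a second name bound to the same object
with_alias = sys.getrefcount(obj)

del alias                         # unbinds the name; the object itself is untouched
after_del = sys.getrefcount(obj)

print(baseline, with_alias, after_del)
```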
I have the following function:
def myfn():
    big_obj = BigObj()
    result = consume(big_obj)
    return result
When is the reference count for the value of BigObj() increased / decreased:
Is it:
when consume(big_obj) is called (since big_obj is not referenced afterwards in myfn)
when the function returns
at some other point that I don't know yet
Would it make a difference to change the last line to:
return consume(big_obj)
Edit (clarification for comments):
A local variable exists until the function returns
the reference can be deleted with del obj
But what about temporaries (e.g. f1(f2()))?
I checked references to temporaries with this code:
import sys
def f2(c):
    print("f2: References to c", sys.getrefcount(c))
def f0():
    print("f0")
    f2(object())
def f1():
    c = object()
    print("f1: References to c", sys.getrefcount(c))
    f2(c)
f0()
f1()
This prints:
f0
f2: References to c 3
f1: References to c 2
f2: References to c 4
It seems that references to temporary values are held. Note that getrefcount gives one more than you would expect because its own argument holds a reference, too.
When is the reference count for big_obj decreased
big_obj does not have a reference count. Variables don't have reference counts. Values do.
big_obj = BigObj()
This line of code creates an instance of the BigObj class. The reference count for that instance may increase or decrease multiple times, depending on the implementation details of that creation process (which is not necessarily written in Python). Notably, though, the assignment to the name big_obj increases the reference count.
when the function returns
At this point, the name big_obj ceases to exist - the name does not disappear simply because it won't be used again. (That's really hard to detect in the general case, and there isn't a particular benefit to it normally). If you must cause a name to cease to exist at a specific point in the operation (for example, because you know that is the last reference and want to trigger garbage collection; or perhaps because you are doing something tricky with __weakref__) then that is what the del statement is for.
Because a name for the object ceases to exist, its reference count decreases. If and when that count reaches zero, the object is garbage collected. It may have any number of references stored in other places, for a wide variety of reasons. (For example, there could be a bug in C code that implements the class; or the class might deliberately maintain its own list of every instance ever created.)
Note that all of the above pertains specifically to the reference implementation. In other implementations, things will differ. There might be some other trigger for garbage collection to happen. There might not be reference counting at all (as with Jython).
From the comments, it seems like what you are worried about is the potential for a memory leak. The code that you show cannot cause a memory leak - but neither can it fix a memory leak caused elsewhere. In Python, as in garbage-collected languages in general, memory leaks happen because objects hold on to references to each other that aren't needed. But there is no concept of "ownership" or "transfer" of references normally - all you need to do is not do things like "maintain a list of every instance ever created" without a) a good reason and b) a way to take instances out of that list when you want to forget about them.
A local variable, though, by definition, cannot extend the lifetime of an object beyond the local scope.
Disclaimer: Most of this information is from the comments, so credit goes to everyone who participated in the discussion.
When an object is deleted is an implementation detail in general.
I will refer to CPython, which is based on reference counting. I ran the code examples with CPython 3.10.0.
An object is deleted when its reference count hits zero.
Returning from a function deletes all local references.
Assigning a new value to a name decreases the reference count of the old value.
Passing a local to a function increases the reference count. The extra reference lives on the stack (frame).
Returning from a function removes that reference from the stack.
The last point is even valid for temporary references like f(g()). The last reference to the result of g() is deleted when f returns (assuming that g does not save a reference somewhere).
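The rule that returning from a function drops its local references can be checked with a weak reference. This is a rough sketch assuming CPython's immediate reference counting; Payload, make_and_drop and make_and_return are hypothetical names used only for this illustration.

```python
import weakref

class Payload:
    pass

def make_and_drop():
    local = Payload()
    return weakref.ref(local)  # a weak reference does not keep the object alive

def make_and_return():
    local = Payload()
    return local, weakref.ref(local)  # the caller receives a strong reference too

dropped_ref = make_and_drop()      # the only strong reference was the local name
kept, kept_ref = make_and_return() # the caller's name keeps the object alive

print(dropped_ref(), kept_ref() is kept)
```

In the first case the object disappears when the function returns; in the second it lives on because the caller still holds it.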
So for the example from the question:
def myfn():
    big_obj = BigObj()         # reference 1
    result = consume(big_obj)  # reference 2 on the stack frame for
                               # consume. Not yet counting any
                               # reference inside of consume.
                               # After consume returns: the stack frame
                               # and reference 2 are deleted. Reference
                               # 1 remains.
    return result              # myfn returns: reference 1 is deleted,
                               # BigObj is deleted
def consume(big_obj):
    pass                       # consume is holding reference 3
If we would change this to:
def myfn():
    return consume(BigObj())  # reference is still saved on the stack
                              # frame and will only be deleted after
                              # consume returns
def consume(big_obj):
    pass                      # consume is holding reference 2
How can I check reliably, if an object was deleted?
You cannot rely on gc.get_objects(). gc is used to detect and recycle reference cycles. Not every reference is tracked by the gc.
You can create a weak reference and check if the reference is still valid.
import sys
import weakref
class BigObj:
    pass
ref = None
def make_ref(obj):
    global ref
    ref = weakref.ref(obj)  # weak reference: does not keep obj alive
    return obj
def myfn():
    return consume(make_ref(BigObj()))
def consume(obj):
    obj = None # remove to see impact on ref count
    print(sys.getrefcount(ref()))
    print(ref()) # There is still a valid reference. It is the one from consume stack frame
myfn()
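Another way to observe the moment of destruction is weakref.finalize from the standard library, which registers a callback that runs when the object is reclaimed. The sketch below assumes CPython (so the callback fires right after the last strong reference is dropped); the events list and the message string are made up for this example.

```python
import weakref

class BigObj:
    pass

events = []  # hypothetical log of finalization messages

obj = BigObj()
finalizer = weakref.finalize(obj, events.append, "BigObj finalized")

alive_before = finalizer.alive  # True while the object still exists
del obj                         # last strong reference removed
alive_after = finalizer.alive   # False once the callback has run

print(alive_before, alive_after, events)
```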
How to pass a reference to a function and remove all references in the calling function?
You can box the reference, pass to the function and clear the boxed reference from inside the function:
class Ref:
    def __init__(self, ref):
        self.ref = ref
    def clear(self):
        self.ref = None
def f1(ref):
    r = ref.ref   # take the reference out of the box
    ref.clear()   # the box no longer refers to the object
def f2():
    f1(Ref(object()))
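To check that the boxing idea really lets the callee release the object, one can combine it with a weak reference. This is only a sketch under the assumption of CPython reference counting; Box, take, probe, consume and caller are hypothetical names, not part of the answer above.

```python
import weakref

class Box:
    """Holds a single strong reference that the receiver can take out."""
    def __init__(self, value):
        self.value = value

    def take(self):
        value, self.value = self.value, None  # empty the box while handing out the value
        return value

class BigObj:
    pass

probe = {}  # hypothetical place to store observations

def consume(box):
    obj = box.take()                 # now the only strong reference is 'obj'
    probe["ref"] = weakref.ref(obj)
    del obj                          # release it while still inside consume
    probe["freed_inside"] = probe["ref"]() is None

def caller():
    consume(Box(BigObj()))           # the caller never binds BigObj to a name

caller()
print(probe["freed_inside"])
```

The caller may still hold the box during the call, but the box is empty by then, so nothing keeps the big object alive.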
Variables have function scope in Python, so they aren't removed until the function returns. As far as I can tell, you can't destroy a reference to a local variable in a function from outside that function. I added some gc calls in the example code to test this.
import gc
class BigObj:
    pass
def consume(obj):
    del obj # only deletes the local reference to obj, but another one still exists in the calling function
def myfn():
    big_obj = BigObj()
    big_obj_id = id(big_obj) # in CPython, this is the memory address of the object
    consume(big_obj)
    print(any(id(obj) == big_obj_id for obj in gc.get_objects()))
    return big_obj_id
>>> big_obj_id = myfn()
True
>>> gc.collect() # I'm not sure where the reference cycle is, but I needed to run this to clean out the big object from the gc's list of objects in my shell
>>> print(any(id(obj) == big_obj_id for obj in gc.get_objects()))
False
Since True was printed, the big object still existed after consume() returned, even though there were no further uses of that variable after that point in the function. Forcing garbage collection after the function returns rightfully determines that the reference count of the big object is 0, so it cleans that object up. NOTE: As the comments below point out, ids of deleted objects may be reused, so checking for equal ids may result in false positives. However, I'm confident that the conclusion is still correct.
One thing you can do to reclaim that memory earlier is to make the big object global, which could allow you to delete it from within the called function.
def consume():
    # do whatever you need to do with the big object
    big_obj_id = id(big_obj)
    del globals()["big_obj"]
    print(any(id(obj) == big_obj_id for obj in gc.get_objects()))
    # do anything else you need to do without the big object
def myfn():
    globals()["big_obj"] = BigObj()
    result = consume()
    return result
>>> myfn()
False
This sort of pattern is pretty weird and likely very hard to maintain though, so I would advise against using this. If you only need to delete the big object right after consume() is called, you could do something like this in order to free up the memory used by the big object as soon as possible.
big_obj = BigObj()
consume(big_obj)
del big_obj
Another strategy you might try is deleting the references within the big object that's passed in from the consume() function with del big_obj.x for some attribute x.
I have something like that:
a = [instance1, instance2, ...]
If I do
del a[1]
instance2 is removed from the list, but is instance2's destructor method called?
I'm interested in this because my code uses a lot of memory and I need to free memory deleting instances from a list.
Coming from a language like c++ (as I did), this tends to be a subject many people find difficult to grasp when first learning Python.
The bottom line is this: when you do del XXX, you are never* deleting an object. You are only deleting an object reference. However, in practice, assuming there are no other references lying about to the instance2 object, deleting it from your list will free the memory as you desire.
If you don't understand the difference between an object and an object reference, read on.
Python: Pass by value, or pass by reference?
You are likely familiar with the concept of passing arguments to a function by reference, or by value. However, Python does things differently. Arguments are always passed by object reference. Read this article for a helpful explanation of what this means.
To summarize: when you pass a variable to a function, you are not passing a copy of the value (pass by value), nor are you passing the caller's variable itself (pass by reference). You are passing a reference to the object, which gets bound to a new local name inside the function.
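A quick sketch of what passing by object reference means in practice. The function names mutate and rebind are invented for this illustration: changing the object through the parameter is visible to the caller, while rebinding the parameter name is not.

```python
def mutate(items):
    items.append(4)   # modifies the object the caller also refers to

def rebind(items):
    items = [0]       # only rebinds the local name; the caller's list is untouched

numbers = [1, 2, 3]
mutate(numbers)
rebind(numbers)

print(numbers)
```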
What does this have to do with del...?
Well, I'll tell you.
Say you do this:
def deleteit(thing):
    del thing
mylist = [1,2,3]
deleteit(mylist)
...what do you think will happen? Has mylist been deleted from the global namespace?
The answer is NO:
assert mylist # No error
The reason is that when you do del thing in the deleteit function, you are only deleting a local object reference to the object. That object reference exists ONLY inside of the function. As a sidebar, you might ask: is it possible to delete an object reference from the global namespace while inside a function? The answer is yes, but you have to declare the object reference to be part of the global namespace first:
def deletemylist():
    global mylist
    del mylist
mylist = [1,2,3]
deletemylist()
assert mylist #ERROR, as expected
Putting it all together
Now to get back to your original question. When, in ANY namespace, you do this:
del XXX
...you have NOT deleted the object signified by XXX. You CAN'T do that. You have only deleted the object reference XXX, which refers to some object in memory. The object itself is managed by the Python memory manager. This is a very important distinction.
Note that as a consequence, when you override the __del__ method of some object, which gets called when the object is deleted (NOT the object reference!):
class MyClass():
    def __del__(self):
        print(self, "deleted")
        # no super().__del__() call here: object does not define __del__
m = MyClass()
del m
...the print statement inside the __del__ method does not necessarily occur immediately after you do del m. It only occurs at the point in time the object itself is deleted, and that is not up to you. It is up to the Python memory manager. When all object references in all the namespaces have been deleted, the __del__ method will eventually be executed. But not necessarily immediately.
The same is true when you delete an object reference that is part of a list, like in the original example. When you do del a[1], only the object reference to the object signified by a[1] is deleted, and the __del__ method of that object may or may not be called immediately (though as stated before, it will eventually be called once there are no more references to the object, and the object is garbage collected by the memory manager).
As a result of this, it is not recommended that you put things in the __del__ method that you want to happen immediately upon del mything, because it may not happen that way.
*I believe it is never. Inevitably someone will likely downvote my answer and leave a comment discussing the exception to the rule. But whatevs.
No. Calling del on a list element only removes a reference to an object from the list, it doesn't do anything (explicitly) to the object itself. However: If the reference in the list was the last one referring to the object, the object can now be destroyed and recycled. I think that the "normal" CPython implementation will immediately destroy and recycle the object, other variants' behaviour can vary.
If your object is resource-heavy and you want to be sure that the resources are freed correctly, use the with statement (a context manager). It's very easy to leak resources when relying on destructors. See this SO post for more details.
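As a rough sketch of the with approach, a class can implement __enter__ and __exit__ so that cleanup happens at a well-defined point instead of whenever __del__ happens to run. The Resource class and its is_open attribute are hypothetical and stand in for whatever resource you manage.

```python
class Resource:
    def __init__(self, name):
        self.name = name
        self.is_open = True   # pretend something expensive was acquired

    def close(self):
        self.is_open = False  # release it explicitly

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()          # runs when the with block ends, even on error
        return False          # do not swallow exceptions

with Resource("demo") as res:
    was_open_inside = res.is_open

print(was_open_inside, res.is_open)
```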
I'm trying to create a class with a static list that collects all new instances of the class. The problem I'm facing is that as soon as I try to use a list the same way as, for example, an integer, I can't use the magic method __del__ anymore.
My Example:
class MyClass(object):
    count = 0
    #instances = []
    def __init__(self, a, b):
        self.a = a
        self.b = b
        MyClass.count += 1
        #MyClass.instances.append(self)
    def __str__(self):
        return self.__repr__()
    def __repr__(self):
        return "a: " + str(self.a) + ", b: " + str(self.b)
    def __del__(self):
        MyClass.count -= 1
        #MyClass.instances.remove(self)
A = MyClass(1,'abc')
B = MyClass(2,'def')
print MyClass.count
del B
print MyClass.count
With comments I get the correct answer:
2
1
But without the comments - including now the static object list MyClass.instances I get the wrong answer:
2
2
It seems like MyClass can't reach its __del__ method anymore! How Come?
From the docs,
del x doesn’t directly call x.__del__() - the former decrements the reference
count for x by one, and the latter is only called when x's reference count
reaches zero.
When you uncomment,
instances = []
...
...
MyClass.instances.append(self)
You are storing a reference to the current Object in the MyClass.instances. That means, the reference count is internally incremented by 1. That is why __del__ is not getting called immediately.
To resolve this problem, explicitly remove the item from the list like this
MyClass.instances.remove(B)
del B
Now it will print
2
1
as expected.
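The effect of the class-level list can be reproduced in a few lines of Python 3. This is a sketch assuming CPython reference counting; Counted is a hypothetical stand-in for MyClass, and the snapshot variables exist only so the counts can be compared.

```python
class Counted:
    count = 0
    instances = []  # strong references: these keep every instance alive

    def __init__(self):
        Counted.count += 1
        Counted.instances.append(self)

    def __del__(self):
        Counted.count -= 1

first = Counted()
second = Counted()

del second                           # the list still refers to the object
count_with_list_ref = Counted.count  # so __del__ has not run

del Counted.instances[-1]            # drop the list's reference as well
count_after_removal = Counted.count  # now the object is finalized (in CPython)

print(count_with_list_ref, count_after_removal)
```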
There is one more way to fix this problem. That is to use weakref. From the docs,
A weak reference to an object is not enough to keep the object alive:
when the only remaining references to a referent are weak references,
garbage collection is free to destroy the referent and reuse its
memory for something else. A primary use for weak references is to
implement caches or mappings holding large objects, where it’s desired
that a large object not be kept alive solely because it appears in a
cache or mapping.
So, having a weakref will not postpone object's deletion. With weakref, this can be fixed like this
MyClass.instances.append(weakref.ref(self))
...
...
# MyClass.instances.remove(weakref.ref(self))
MyClass.instances = [w_ref for w_ref in MyClass.instances if w_ref() is not None]
Instead of using remove method, we can call each of the weakref objects and if they return None, they are already dead. So, we filter them out with the list comprehension.
So now, when you say del B, even though weakrefs exist for B, it will call __del__ (unless you have made some other variable point to the same object, for example through an assignment).
From http://docs.python.org/2.7/reference/datamodel.html#basic-customization I quote (paragraph in gray after object.__del__):
del x doesn’t directly call x.__del__() - the former decrements the reference count for x by one, and the latter is only called when x's reference count reaches zero.
Here you call del B, but there is still a reference to B's object in MyClass.instances, so the object is still referenced and hence not destroyed, and the __del__ method is not called.
If you call B.__del__() directly, it works.
__del__ is only called when no more references to the object are left.
You should consider putting only weak refs into the MyClass.instances list.
This can be achieved with import weakref and then
either using a WeakSet for the list
or putting weakref.ref(self) into the list.
__del__ is automatically called whenever the last strong reference is removed. The weakrefs disappear automatically.
But be aware that there are some caveats on __del__ mentioned in the docs.
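The WeakSet variant could look roughly like this in Python 3. It is a sketch that assumes CPython (where the object is reclaimed as soon as its last strong reference is deleted); the class name Tracked is hypothetical.

```python
import weakref

class Tracked:
    instances = weakref.WeakSet()  # holds weak references only

    def __init__(self, a, b):
        self.a = a
        self.b = b
        Tracked.instances.add(self)

first = Tracked(1, "abc")
second = Tracked(2, "def")
count_before = len(Tracked.instances)

del second                          # last strong reference to the second object
count_after = len(Tracked.instances)

print(count_before, count_after)
```

Because the set never keeps an instance alive, no __del__ bookkeeping is needed to remove entries.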
__del__ is called when the garbage collector removes an object from memory. If you add your object to MyClass.instances, the list holds a reference to it, so the garbage collector will never try to remove it, and __del__ is never called.
You'd better use an explicit function (MyClass.del_element()) because you can't really predict when __del__ will be called (even if you don't add it to a list).
import weakref
import gc
class MyClass(object):
    def refer_to(self, thing):
        self.refers_to = thing
foo = MyClass()
bar = MyClass()
foo.refer_to(bar)
bar.refer_to(foo)
foo_ref = weakref.ref(foo)
bar_ref = weakref.ref(bar)
del foo
del bar
gc.collect()
print foo_ref()
I want foo_ref and bar_ref to retain weak references to foo and bar respectively as long as they reference each other*, but this instead prints None. How can I prevent the garbage collector from collecting certain objects within reference cycles?
bar should be garbage-collected in this code because it is no longer part of the foo-bar reference cycle:
baz = MyClass()
baz.refer_to(foo)
foo.refer_to(baz)
gc.collect()
* I realize it might seem pointless to prevent circular references from being garbage-collected, but my use case requires it. I have a bunch of objects that refer to each other in a web-like fashion, along with a WeakValueDictionary that keeps a weak reference to each object in the bunch. I only want an object in the bunch to be garbage-collected when it is orphaned, i.e. when no other objects in the bunch refer to it.
Normally using weak references means that you cannot prevent objects from being garbage collected.
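However, there is a trick you can use to prevent objects that are part of a reference cycle from being garbage collected: define a __del__() method on these. (This relies on the behaviour of Python 2 and of Python 3 before 3.4; since Python 3.4, PEP 442 allows the collector to free cycles whose objects have __del__() methods.)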
From the gc module documentation:
gc.garbage
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list
contains only objects with __del__() methods. Objects that have
__del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily
in the cycle but reachable only from it. Python doesn’t collect such
cycles automatically because, in general, it isn’t possible for Python
to guess a safe order in which to run the __del__() methods. If you
know a safe order, you can force the issue by examining the garbage
list, and explicitly breaking cycles due to your objects within the
list. Note that these objects are kept alive even so by virtue of
being in the garbage list, so they should be removed from garbage too.
For example, after breaking cycles, do del gc.garbage[:] to empty the
list. It’s generally better to avoid the issue by not creating cycles
containing objects with __del__() methods, and garbage can be examined
in that case to verify that no such cycles are being created.
When you define MyClass as follows:
class MyClass(object):
    def refer_to(self, thing):
        self.refers_to = thing
    def __del__(self):
        print 'Being deleted now, bye-bye!'
then your example script prints:
<__main__.MyClass object at 0x108476a50>
but commenting out one of the .refer_to() calls results in:
Being deleted now, bye-bye!
Being deleted now, bye-bye!
None
In other words, by simply having defined a __del__() method, we prevented the reference cycle from being garbage collected, but any orphaned objects are being deleted.
Note that in order for this to work, you need circular references; any object in your object graph that is not part of a reference circle will be picked off regardless.
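For contrast, here is a sketch of what happens to the same kind of cycle when the class has no __del__ method, written for Python 3 and assuming CPython. Node is a hypothetical stand-in for MyClass. Automatic collection is switched off for the duration so that the explicit gc.collect() call is the only thing that can free the cycle.

```python
import gc
import weakref

class Node:
    def refer_to(self, other):
        self.refers_to = other

gc.disable()  # keep automatic collection from interfering with the demo
try:
    foo, bar = Node(), Node()
    foo.refer_to(bar)
    bar.refer_to(foo)
    foo_ref = weakref.ref(foo)

    del foo, bar
    # The two objects still refer to each other, so their counts never reach zero.
    alive_before_collect = foo_ref() is not None

    gc.collect()  # the cycle collector finds the unreachable pair
    alive_after_collect = foo_ref() is not None
finally:
    gc.enable()

print(alive_before_collect, alive_after_collect)
```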
class example:
    def exampleMethod(self):
        aVar = 'some string'
        return aVar
In this example, how does garbage collection work after each call to example.exampleMethod()? Will aVar be deallocated once the method returns?
The variable is never deallocated.
The object (in this case a string with a value of 'some string') is reused again and again, so that object can never be deallocated.
Objects are deallocated when no variable refers to the object. Think of this.
a = 'hi mom'
a = 'next value'
In this case, the first object (a string with the value 'hi mom') is no longer referenced anywhere in the script when the second statement is executed. The object ('hi mom') can be removed from memory.
Every time you assign an object to a variable, you increase this object's reference counter.
a = MyObject() # +1, so it's at 1
b = a # +1, so it's now 2
a = 'something else' # -1, so it's 1
b = 'something else' # -1, so it's 0
No one can access the MyObject object we created on the first line anymore.
When the counter reaches zero, the garbage collector frees the memory.
There is a way to make a tricky reference that does not increase the reference counter (e.g. if you don't want an object to be held in memory just because it's in some cache dict).
More on CPython's reference counting can be found here.
Python is a language; CPython is its (quite popular) implementation. AFAIK the language itself doesn't specify how the memory is freed.
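The counter walk-through above can be verified with a weak reference, which is the kind of reference that does not increase the counter. This is a sketch assuming CPython; the names probe, still_alive and gone are invented for the example.

```python
import weakref

class MyObject:
    pass

a = MyObject()
probe = weakref.ref(a)           # does not count as a reference to the object

b = a                            # second strong reference
a = "something else"             # first name rebound; one strong reference left
still_alive = probe() is not None

b = "something else"             # last strong reference gone
gone = probe() is None

print(still_alive, gone)
```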
From your example, if you call example.exampleMethod() , without assigning the results (eg. a = example.exampleMethod()) then it will be deallocated straight away (in CPython), as CPython uses a reference counting mechanism. Strings aren't a very good example to use, because they also have a number of implementation specific optimizations. Strings can be cached, and are not deallocated so that they can be reused. This is especially useful because strings are very common for use as keys in dicts.
Again, garbage collection is specific to the implementation, so CPython, Jython and IronPython will have different behaviours, most of these being documented on the respective sites/manuals/code/etc. If you want to explore a bit, I'd suggest creating a class where you have defined the __del__() method, which will be called when the object is garbage collected (it's the destructor). Make it print something so you can trace its call :)
As in Nico's answer, it depends on what you do with the result returned by exampleMethod. Python (or CPython anyway) uses reference counting. During the method, aVar references the string, while after that the variable aVar is deleted, which may leave no references, in which case, it is deleted.
Below is an example with a custom class that has a destructor (__del__(self)) that prints "Object 1 being destructed" or similar. gc is the garbage collector module; gc.collect() is called here for convenience to force a collection, as otherwise there is no guarantee when the garbage collector runs.
import gc
class Noisy(object):
    def __init__(self, n):
        self.n = n
    def __del__(self):
        print "Object " + str(self.n) + " being destructed"
class example(object):
    def exampleMethod(self, n):
        aVar = Noisy(n)
        return aVar
a = example()
a.exampleMethod(1)
b = a.exampleMethod(2)
gc.collect()
print "Before b is deleted"
del b
gc.collect()
print "After b is deleted"
The result should be as follows:
Object 1 being destructed
Before b is deleted
Object 2 being destructed
After b is deleted
Notice that the first Noisy object is deleted as soon as the method returns, as it is not assigned to a variable and so has a reference count of 0, but the second one is deleted only after the variable b is deleted, leaving a reference count of 0.