I have something like this:
a = [instance1, instance2, ...]
if I do a
del a[1]
instance2 is removed from the list, but is instance2's destructor method called?
I'm interested in this because my code uses a lot of memory and I need to free memory by deleting instances from a list.
Coming from a language like C++ (as I did), this tends to be a subject many people find difficult to grasp when first learning Python.
The bottom line is this: when you do del XXX, you are never* deleting an object. You are only deleting an object reference. However, in practice, assuming there are no other references lying about to the instance2 object, deleting it from your list will free the memory as you desire.
If you don't understand the difference between an object and an object reference, read on.
Python: Pass by value, or pass by reference?
You are likely familiar with the concept of passing arguments to a function by reference, or by value. However, Python does things differently. Arguments are always passed by object reference. Read this article for a helpful explanation of what this means.
To summarize: when you pass a variable to a function, you are not passing a copy of its value (pass by value), nor are you passing the caller's variable itself (pass by reference). You are passing a reference to the object the variable refers to, and the parameter name inside the function becomes one more reference to that same object.
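A minimal sketch of the difference, using two hypothetical helper functions (mutate and rebind are names made up for this illustration). Changing the object through the parameter is visible to the caller, while rebinding the parameter name is not:

```python
def mutate(items):
    # Changes the object that the caller's name also refers to.
    items.append(4)


def rebind(items):
    # Rebinds only the local name; the caller's list is untouched.
    items = [0]


nums = [1, 2, 3]
mutate(nums)
rebind(nums)
print(nums)  # [1, 2, 3, 4]
```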
What does this have to do with del...?
Well, I'll tell you.
Say you do this:
def deleteit(thing):
    del thing

mylist = [1,2,3]
deleteit(mylist)
...what do you think will happen? Has mylist been deleted from the global namespace?
The answer is NO:
assert mylist # No error
The reason is that when you do del thing in the deleteit function, you are only deleting a local object reference to the object. That object reference exists ONLY inside of the function. As a sidebar, you might ask: is it possible to delete an object reference from the global namespace while inside a function? The answer is yes, but you have to declare the object reference to be part of the global namespace first:
def deletemylist():
    global mylist
    del mylist

mylist = [1,2,3]
deletemylist()
assert mylist # NameError, as expected
Putting it all together
Now to get back to your original question. When, in ANY namespace, you do this:
del XXX
...you have NOT deleted the object signified by XXX. You CAN'T do that. You have only deleted the object reference XXX, which refers to some object in memory. The object itself is managed by the Python memory manager. This is a very important distinction.
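To make the distinction concrete, here is a small sketch (the names first and second are made up for the illustration). Deleting one name leaves the object alive as long as another reference exists:

```python
first = [1, 2, 3]
second = first          # a second reference to the same list object

del first               # removes only the name "first"

try:
    first
    name_exists = True
except NameError:
    name_exists = False

print(name_exists)      # False
print(second)           # [1, 2, 3]
```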
Note that as a consequence, when you override the __del__ method of some object, which gets called when the object is deleted (NOT the object reference!):
class MyClass:
    def __del__(self):
        # object defines no __del__, so there is no parent implementation to call
        print(self, "deleted")

m = MyClass()
del m
...the print statement inside the __del__ method does not necessarily occur immediately after you do del m. It only occurs at the point in time the object itself is deleted, and that is not up to you. It is up to the Python memory manager. Once all object references in all the namespaces have been deleted, the __del__ method will normally be executed, but not necessarily immediately, and the language does not strictly guarantee that it runs at all (for example, for objects still alive when the interpreter exits).
The same is true when you delete an object reference that is part of a list, like in the original example. When you do del a[1], only the object reference to the object signified by a[1] is deleted, and the __del__ method of that object may or may not be called immediately (though as stated before, it will normally be called once there are no more references to the object and the object is reclaimed by the memory manager).
As a result of this, it is not recommended that you put things in the __del__ method that you want to happen immediately upon del mything, because it may not happen that way.
*I believe it is never. Inevitably someone will likely downvote my answer and leave a comment discussing the exception to the rule. But whatevs.
No. Calling del on a list element only removes a reference to an object from the list; it doesn't do anything (explicitly) to the object itself. However, if the reference in the list was the last one referring to the object, the object can now be destroyed and recycled. I think that the "normal" CPython implementation will immediately destroy and recycle the object; the behaviour of other implementations can vary.
If your object is resource-heavy and you want to be sure that the resources are freed correctly, use the with statement (a context manager). It's very easy to leak resources when relying on destructors. See this SO post for more details.
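A rough sketch of this behaviour, assuming CPython (other implementations may run __del__ later). Tracked is a hypothetical class that records its name in a list when it is destroyed, which makes the timing observable:

```python
class Tracked:
    """Hypothetical class that logs its own destruction."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        self.log.append(self.name)


log = []
a = [Tracked("instance1", log), Tracked("instance2", log), Tracked("instance3", log)]
keep = a[2]                 # an extra reference to instance3

del a[1]                    # last reference to instance2 is gone
after_first = list(log)

del a[1]                    # removes instance3 from the list, but "keep" still refers to it
after_second = list(log)

del keep                    # now the last reference to instance3 is gone
after_third = list(log)

print(after_first, after_second, after_third)
```

On CPython, instance2 is logged right after the first del, while instance3 is only logged once keep is deleted as well.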
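As an illustration of the with approach, here is a minimal sketch. Resource is a hypothetical stand-in for whatever is expensive in your program; the point is only that __exit__ runs at a known moment, regardless of how many references remain:

```python
class Resource:
    """Hypothetical stand-in for something expensive (buffer, file, connection)."""

    def __init__(self):
        self.data = bytearray(1024)
        self.released = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Release the expensive part deterministically.
        self.data = None
        self.released = True
        return False            # do not swallow exceptions


with Resource() as res:
    inside = res.released       # still False inside the block

print(inside, res.released)     # False True
```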
Related
I am writing a python class like this:
class MyImageProcessor:
    def __init__ (self, image, metadata):
        self.image=image
        self.metadata=metadata
Both image and metadata are objects of a class written by a colleague. Now I need to make sure there is no waste of memory. I am thinking of defining a quit() method like this,
def quit(self):
    self.image=None
    self.metadata=None
    import gc
    gc.collect()
and suggest that users call quit() systematically. I would like to know whether this is the right way. In particular, do the instructions in quit() above guarantee that unused memory is properly collected?
Alternatively, I could rename quit() to the built-in __exit__() and suggest that users use the "with" syntax. But my question is more about whether the instructions in quit() indeed do the garbage collection work one would need in this situation.
Thank you for your help.
In Python every object has a built-in reference count; the variables (names) you create are only pointers to the objects. There are mutable and immutable objects (for example, if you change the value of an integer, the name will point to another integer object, while changing a list element does not make the list's name point to a different object).
The reference count basically counts how many references to the object exist, and it is incremented/decremented automatically.
The garbage collector will destroy the objects with zero references (actually not always, it takes extra steps to save time). You should check out this article.
Similarly to object constructors (__init__()), which are called on object creation, you can define destructors (__del__()), which are executed on object deletion (usually when the reference count drops to 0). According to this article, in Python they are not needed as much as in C++ because Python has a garbage collector that handles memory management automatically. You can check out those examples too.
Hope it helps :)
No need for quit() (assuming you're using CPython, the C-based implementation of Python).
Python uses two methods of garbage collection, as alluded to in the other answers.
First, there's reference counting. Essentially each time you add a reference to an object it gets incremented & each time you remove the reference (e.g., it goes out of scope) it gets decremented.
From https://devguide.python.org/garbage_collector/:
When an object’s reference count becomes zero, the object is deallocated. If it contains references to other objects, their reference counts are decremented. Those other objects may be deallocated in turn, if this decrement makes their reference count become zero, and so on.
You can get information about current reference counts for an object using sys.getrefcount(x), but really, why bother.
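If you do want to look, here is a small sketch. The absolute numbers depend on the interpreter version (the call itself may add a temporary reference), so the example only compares counts against a baseline. Payload is a hypothetical placeholder class:

```python
import sys


class Payload:
    pass


obj = Payload()
base = sys.getrefcount(obj)         # baseline, includes any temporary reference

alias = obj                         # one more name for the same object
with_alias = sys.getrefcount(obj)

holder = [obj]                      # a container reference counts too
with_list = sys.getrefcount(obj)

del alias
holder.clear()
back = sys.getrefcount(obj)

print(with_alias - base, with_list - base, back - base)
```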
The second way is through garbage collection (gc). [Reference counting is a type of garbage collection, but python specifically calls this second method "garbage collection" -- so we'll also use this terminology. ] This is intended to find those places where reference count is not zero, but the object is no longer accessible. ("Reference cycles") For example:
class MyObj:
    pass

x = MyObj()
x.self = x
Here, x refers to itself, so the actual reference count for x is more than 1. You can call del x but that merely removes it from your scope: it lives on because "someone" still has a reference to it.
gc, and specifically gc.collect() goes through objects looking for cycles like this and, when it finds an unreachable cycle (such as your x post deletion), it will deallocate the whole lot.
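One way to watch this happen is with a weak reference, which lets you check whether the object still exists without keeping it alive. This is a sketch assuming CPython; automatic collection is switched off temporarily so that the explicit gc.collect() call is the only thing that can reclaim the cycle:

```python
import gc
import weakref


class MyObj:
    pass


gc.disable()                        # keep the automatic collector out of the way

x = MyObj()
x.self = x                          # reference cycle: the object refers to itself
probe = weakref.ref(x)              # weak reference, does not keep x alive

del x                               # the name is gone, the cycle keeps the object alive
alive_before = probe() is not None

found = gc.collect()                # the cycle detector reclaims the unreachable object
alive_after = probe() is not None

gc.enable()

print(alive_before, alive_after)    # True False
```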
Back to your question: You don't need to have a quit() method because as soon as your MyImageProcessor object goes out of scope, it will decrement the reference counts for image and metadata. If that puts them at zero, they're deallocated. If it doesn't, well, someone else is using them.
Setting them to None first merely decrements the reference counts right then, but when MyImageProcessor goes out of scope, it won't decrement those reference counts again, because MyImageProcessor no longer holds the image or metadata objects! So you're just explicitly doing what Python does for you already for free: no more, no less.
You didn't create a cycle, so your calling gc.collect() is unlikely to change anything.
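A sketch of that reasoning, assuming CPython. Image and Metadata are hypothetical placeholders for the colleague's classes, and the weak reference is only there to observe when the image object is actually freed:

```python
import weakref


class Image:                        # hypothetical stand-in for the colleague's class
    pass


class Metadata:                     # hypothetical stand-in for the colleague's class
    pass


class MyImageProcessor:
    def __init__(self, image, metadata):
        self.image = image
        self.metadata = metadata


image = Image()
probe = weakref.ref(image)          # lets us observe the image without owning it

proc = MyImageProcessor(image, Metadata())
del image                           # the processor still holds a reference
held_by_processor = probe() is not None

del proc                            # no quit() needed: dropping the processor releases the image
freed = probe() is None

print(held_by_processor, freed)     # True True
```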
Check out https://devguide.python.org/garbage_collector/ if you are interested in more earthy details.
Not sure if it makes sense, but by my logic you could call gc.get_count() before and after gc.collect() to see whether something has been removed.
what are count0, count1 and count2 values returned by the Python gc.get_count()
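A rough sketch of that idea, assuming CPython. Besides comparing the gc.get_count() tuples, note that gc.collect() itself returns the number of unreachable objects it found, which is often the more direct signal. The self-referencing lists below exist only to give the collector something to find:

```python
import gc

gc.collect()                        # start from a clean slate
gc.disable()                        # keep the automatic collector out of the way

for _ in range(10):
    cycle = []
    cycle.append(cycle)             # a list that contains itself
del cycle                           # all ten lists are now unreachable cycles

counts_before = gc.get_count()      # (count0, count1, count2)
found = gc.collect()                # number of unreachable objects found
counts_after = gc.get_count()

gc.enable()

print(counts_before, counts_after, found)
```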
I am creating a program that spawns objects randomly. These objects have a limited lifetime.
I create these objects and place them in a list. The objects keep track of how long they exist and eventually expire. They are no longer needed after expiration.
I would like to delete the objects after they expire but I'm not sure how to reference the specific object in the list to delete it.
if something:
    list.append(SomeObject())
---- later---
I would like a cleanup process that looks at the variable in the Object and if it is expired, then remove it from the list.
Thanks for your help in advance.
You can use the reference count if you define "no longer used" as "no other object keeps a reference". That is a good definition: once no references exist, the object can no longer be accessed and may be disposed of. In fact, Python's garbage collector will do that for you.
Where it goes wrong is when you also keep all the instances in a list. That also counts as a reference to the object, so it will never be disposed of.
For example, take a list of state variables that are referenced not only by their owning objects but also by a list to allow linear access. Explicitly call a cleanup function from the accessor to keep the list clean:
from sys import getrefcount

GlobalStateList = []

def gcOnGlobalStateList():
    for s in reversed(GlobalStateList):
        if (getrefcount(s) <= 3): # 1=GlobalStateList, 2=loop variable s, 3=getrefcount()
            GlobalStateList.remove(s)

def getGlobalStateList():
    gcOnGlobalStateList()
    return GlobalStateList
Note that even looking at the refcount increases it, so the test-value is three or less.
Assuming that your concept of SomeObject "expiry" is not directly related to the underlying Python object lifetime (reference counting, etc.) I would suggest that the easiest way to purge the list is to occasionally run through it, dropping any expired objects:
lst = [obj for obj in lst if not obj.expired]
Note that you shouldn't call your own variables list, as this will shadow the built-in.
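A fuller sketch of the same idea. SomeObject here is a hypothetical implementation with a lifetime in seconds, and the now parameter is an assumption added so that the example does not depend on the real clock. The slice assignment updates the existing list in place, which matters if other code holds a reference to it:

```python
import time


class SomeObject:
    """Hypothetical object that expires `lifetime` seconds after creation."""

    def __init__(self, lifetime, now=None):
        self.lifetime = lifetime
        self.created = time.monotonic() if now is None else now

    def expired(self, now=None):
        now = time.monotonic() if now is None else now
        return now - self.created >= self.lifetime


def purge(objects, now=None):
    # Slice assignment keeps the same list object, so other references to it stay valid.
    objects[:] = [obj for obj in objects if not obj.expired(now)]


objects = [SomeObject(lifetime=5, now=0), SomeObject(lifetime=20, now=0)]
purge(objects, now=10)              # the 5-second object has expired, the 20-second one has not
print(len(objects))                 # 1
```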
I think this is the most common question on interviews:
class A:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print self.name,

aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a
So what is the output of this code, and why?
The output is nothing. Why?
Is that because a is a reference to an object in the list, and when we use del we remove this reference but not the object?
There are at least 2 references to the object that a references (variables are references to objects, they are not the objects themselves). There's the one reference inside the list, and then there's the reference "a". When you del a, you remove one reference (the variable a) but not the reference from inside the list.
Also note that Python doesn't guarantee that __del__ will ever be called ...
__del__ gets called when an object is destroyed; this will happen after the last possible reference to the object is removed from the program's accessible memory. Depending on the implementation this might happen immediately or might be after some time.
Your code just removes the local name a from the execution scope; the object remains in the list so is still accessible. Try writing del aa[0], for example.
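A sketch of that suggestion in Python 3 syntax, assuming CPython. Instead of printing, this variant of A appends to a log list (an addition made for this illustration) so the effect of each step can be inspected:

```python
class A:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        self.log.append(self.name)


log = []
aa = [A(str(i), log) for i in range(3)]

for a in aa:
    del a                   # unbinds the loop name only; the list still holds each object
after_loop = list(log)

del aa[0]                   # removes the list's reference, the last one to object "0"
after_del = list(log)

print(after_loop, after_del)
```

On CPython the loop leaves the log empty, and only del aa[0] triggers __del__ for the first object.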
From the docs:
Note del x doesn't directly call x.__del__() — the former decrements the reference count for x by one, and the latter is only called when x's reference count reaches zero.
__del__ is triggered when the garbage collector finds an object to be destroyed. The garbage collector will try to destroy objects with a reference count of 0. del just decouples the label in the local namespace, thereby decrementing the reference count for the object in the interpreter. The behavior of the garbage collector is for the most part considered an implementation detail of the interpreter, so there's no guarantee that __del__ on objects will be called in any specific order, or even at all. That's why the behavior of this code is undefined.
Can anyone explain to me why this code prints '2 1 0 done' instead of the expected output '0 1 2 done'?
As I understand it, some anonymous variables are created during the list comprehension, and they are garbage-collected in FILO order when the list comprehension finishes.
But they are still referenced in the list aa, aren't they?
Why doesn't the second 'del a' call the __del__ magic method in that case?
class A:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print self.name,

aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a
print 'done'
Also, a more advanced question. Please look at http://codepad.org/niaUzGEy
Why are there 5 copies, or 3 copies? Mustn't there be 1 copy? Why 5 or 3? I don't know, that's why I'm asking ;)
Thanks for your time!
You are confusing the del statement and the __del__ method.
del a simply unbinds the name a from whatever object it referenced. The list referenced by aa is unchanged so the objects all continue to exist.
The __del__ method is only called after the last reference to an object has been destroyed. That could be after a del statement, but usually isn't.
You rarely need to use del. It would be much more common just to rebind aa and then all the objects it contains will be released, and if not otherwise referenced their __del__ methods will be called automatically.
Also, you rarely need to use __del__. For most purposes Python's management of objects will handle cleanup automatically. Adding a __del__ method to a class is generally a bad idea as it can interfere with the garbage collector (before Python 3.4, objects with __del__ that were part of a reference cycle were never collected), so rather paradoxically __del__ makes it more likely that your program will leak memory. Also, Python won't guarantee that __del__ is actually called on program exit, and if it is, you may find that global variables you want to use no longer exist; nor will it guarantee to call it only once (though you have to jump through hoops to make it call it more than once).
In short, avoid using __del__ if you possibly can.
It prints done 2 1 0 (CPython).
You don't delete list elements in the for loop. They are deleted on exit. As far as I know, the call order of __del__ is implementation-specific, so it can be different in other implementations (IronPython, Jython, etc.).
class A:
    def __get(self):
        return self._x
    def __set(self, y):
        self._x = y
    def __delete_x(self):
        print('DELETING')
        del self._x
    x = property(__get,__set,__delete_x)

b = A()
# Here, when b is deleted, i'd like b.x to be deleted, i.e __delete_x()
# called (and for immediate consequence, "DELETING" printed)
del b
The semantics of the del statement don't really lend themselves to what you want here. del b simply removes the reference to the A object you just instantiated from the local scope frame / dictionary; this does not directly cause any operation to be performed on the object itself. If that was the last reference to the object, then the reference count dropping to zero, or the garbage collector collecting a cycle, may cause the object to be deallocated. You could observe this by adding a __del__ method to the object, or by adding a weakref callback that performs the desired actions.
Neither of the latter two solutions seems like a great idea, though; __del__ methods prevent the garbage collector from collecting any cycles involving the object (in Python versions before 3.4); and while weakrefs do not suffer from this problem, in either case you may be running in a strange environment (such as during program shutdown), which may make it difficult to get done what you want to accomplish.
If you can expand on your exact use case, it may be that there is an entirely different approach to accomplishing your desired end goal, but it is difficult to speculate based on such a general and limited example.
To control what happens when an instance of class A goes away (whether by being deleted or garbage collected), you can implement the special method __del__(self) in A. If you want your code to run when a specific attribute of that instance goes away, you can either wrap that attribute in a wrapper class which has __del__, or, probably better in most cases, use the weakref module (however, not all types can be the target of weak references, so you may also need some wrapping for this case).
Avoiding __del__ is generally preferable, if you possibly can, because it can interfere with garbage collection and thereby cause "memory leaks" if and when you have circular references.
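As one possible shape for the weakref route, here is a sketch using weakref.finalize from the standard library, assuming CPython for the timing. The events list and the bare class A are placeholders for this illustration; the callback must not hold a strong reference to the object itself, otherwise the object would never go away:

```python
import weakref


class A:
    pass


events = []

b = A()
# Register a callback that runs when the object is reclaimed.
fin = weakref.finalize(b, events.append, "DELETING")

alive_before = fin.alive    # True: the object still exists
del b                       # last reference gone, the finalizer runs
alive_after = fin.alive     # False: the callback has already been called

print(events)               # ['DELETING']
```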
An ugly way to do it would be :
def __del__(self):
    for x in dir(self.__class__):
        prop = getattr(self.__class__, x)
        # Call the deleter of every property that defines one
        if isinstance(prop, property) and prop.fdel is not None:
            prop.fdel(self)