Preserving circular references after garbage collection - python

import weakref
import gc

class MyClass(object):
    def refer_to(self, thing):
        self.refers_to = thing

foo = MyClass()
bar = MyClass()
foo.refer_to(bar)
bar.refer_to(foo)
foo_ref = weakref.ref(foo)
bar_ref = weakref.ref(bar)
del foo
del bar
gc.collect()
print(foo_ref())
I want foo_ref and bar_ref to retain weak references to foo and bar respectively as long as they reference each other*, but this instead prints None. How can I prevent the garbage collector from collecting certain objects within reference cycles?
bar should be garbage-collected in this code because it is no longer part of the foo-bar reference cycle:
baz = MyClass()
baz.refer_to(foo)
foo.refer_to(baz)
gc.collect()
* I realize it might seem pointless to prevent circular references from being garbage-collected, but my use case requires it. I have a bunch of objects that refer to each other in a web-like fashion, along with a WeakValueDictionary that keeps a weak reference to each object in the bunch. I only want an object in the bunch to be garbage-collected when it is orphaned, i.e. when no other objects in the bunch refer to it.

Normally using weak references means that you cannot prevent objects from being garbage collected.
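However, there is a trick you can use to prevent objects that are part of a reference cycle from being garbage collected: define a __del__() method on them. This relies on the collector's behaviour in Python 2 and in Python 3 before 3.4; since Python 3.4 (PEP 442), cycles containing objects with __del__() methods are collected as well.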
From the gc module documentation:
gc.garbage
A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list
contains only objects with __del__() methods. Objects that have
__del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily
in the cycle but reachable only from it. Python doesn’t collect such
cycles automatically because, in general, it isn’t possible for Python
to guess a safe order in which to run the __del__() methods. If you
know a safe order, you can force the issue by examining the garbage
list, and explicitly breaking cycles due to your objects within the
list. Note that these objects are kept alive even so by virtue of
being in the garbage list, so they should be removed from garbage too.
For example, after breaking cycles, do del gc.garbage[:] to empty the
list. It’s generally better to avoid the issue by not creating cycles
containing objects with __del__() methods, and garbage can be examined
in that case to verify that no such cycles are being created.
When you define MyClass as follows:
class MyClass(object):
    def refer_to(self, thing):
        self.refers_to = thing

    def __del__(self):
        print('Being deleted now, bye-bye!')
then your example script prints:
<__main__.MyClass object at 0x108476a50>
but commenting out one of the .refer_to() calls results in:
Being deleted now, bye-bye!
Being deleted now, bye-bye!
None
In other words, by simply having defined a __del__() method, we prevented the reference cycle from being garbage collected, but any orphaned objects are being deleted.
Note that in order for this to work, you need circular references; any object in your object graph that is not part of a reference cycle will be picked off regardless.
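For readers on a newer interpreter, here is a minimal sketch of the same experiment, assuming CPython 3.4 or later, where a cycle whose objects define __del__() is collected anyway. The name argument and the finalized list are hypothetical additions used only to record which finalizers ran.

```python
import gc
import weakref

finalized = []  # hypothetical log, only used to observe finalizer calls


class MyClass(object):
    def __init__(self, name):
        self.name = name
        self.refers_to = None

    def refer_to(self, thing):
        self.refers_to = thing

    def __del__(self):
        finalized.append(self.name)


foo = MyClass("foo")
bar = MyClass("bar")
foo.refer_to(bar)
bar.refer_to(foo)
foo_ref = weakref.ref(foo)

del foo
del bar
gc.collect()

# On CPython 3.4+ the cycle is collected despite __del__, so the weak
# reference is dead and both finalizers have run.
print(foo_ref())
print(sorted(finalized))
```

On such interpreters the __del__() trick does not keep the cycle alive, so something outside the cycle would have to hold a strong reference to it.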

Related

Call destructor of an element from a list

I have something like that:
a = [instance1, instance2, ...]
If I do
del a[1]
instance2 is removed from the list, but is instance2's destructor method called?
I'm interested in this because my code uses a lot of memory and I need to free memory deleting instances from a list.
Coming from a language like c++ (as I did), this tends to be a subject many people find difficult to grasp when first learning Python.
The bottom line is this: when you do del XXX, you are never* deleting an object. You are only deleting an object reference. However, in practice, assuming there are no other references lying around to the instance2 object, deleting it from your list will free the memory as you desire.
If you don't understand the difference between an object and an object reference, read on.
Python: Pass by value, or pass by reference?
You are likely familiar with the concept of passing arguments to a function by reference, or by value. However, Python does things differently. Arguments are always passed by object reference. Read this article for a helpful explanation of what this means.
To summarize: when you pass a variable to a function, you are not passing a copy of the variable's value (pass by value), nor are you passing the variable itself so that the function could rebind it for the caller (pass by reference). You are passing a reference to the object the variable names, and the function's parameter becomes a new local name for that same object.
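A small sketch of what that means in practice. The function names rebind and mutate are made up for this illustration: rebinding the parameter is invisible to the caller, while mutating the shared object is not.

```python
def rebind(items):
    # Rebinding the parameter only changes the function's local name.
    items = [0]


def mutate(items):
    # Mutating through the parameter changes the object the caller sees.
    items.append(4)


mylist = [1, 2, 3]
rebind(mylist)
after_rebind = list(mylist)  # snapshot after the rebinding call
mutate(mylist)
print(after_rebind, mylist)
```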
What does this have to do with del...?
Well, I'll tell you.
Say you do this:
def deleteit(thing):
    del thing

mylist = [1,2,3]
deleteit(mylist)
...what do you think will happen? Has mylist been deleted from the global namespace?
The answer is NO:
assert mylist # No error
The reason is that when you do del thing in the deleteit function, you are only deleting a local object reference to the object. That object reference exists ONLY inside of the function. As a sidebar, you might ask: is it possible to delete an object reference from the global namespace while inside a function? The answer is yes, but you have to declare the object reference to be part of the global namespace first:
def deletemylist():
    global mylist
    del mylist

mylist = [1,2,3]
deletemylist()
assert mylist # NameError, as expected
Putting it all together
Now to get back to your original question. When, in ANY namespace, you do this:
del XXX
...you have NOT deleted the object signified by XXX. You CAN'T do that. You have only deleted the object reference XXX, which refers to some object in memory. The object itself is managed by the Python memory manager. This is a very important distinction.
Note that as a consequence, when you override the __del__ method of some object, which gets called when the object is deleted (NOT the object reference!):
class MyClass():
    def __del__(self):
        # object defines no __del__ of its own, so there is nothing to chain to
        print(self, "deleted")

m = MyClass()
del m
...the print call inside the __del__ method does not necessarily occur immediately after you do del m. It only occurs at the point in time the object itself is deleted, and that is not up to you. It is up to the Python memory manager. When all object references in all the namespaces have been deleted, the __del__ method will normally be executed, but not necessarily immediately.
The same is true when you delete an object reference that is part of a list, like in the original example. When you do del a[1], only the object reference to the object signified by a[1] is deleted, and the __del__ method of that object may or may not be called immediately (though as stated before, it will normally be called once there are no more references to the object and the object is reclaimed by the memory manager).
As a result of this, it is not recommended that you put things in the __del__ method that you want to happen immediately upon del mything, because it may not happen that way.
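To make the timing concrete, here is a minimal sketch that assumes CPython's reference counting; other implementations may run the finalizer later. The Instance class, the extra name, and the finalized log are hypothetical.

```python
finalized = []  # hypothetical log of finalizer calls


class Instance(object):
    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)


a = [Instance("instance1"), Instance("instance2")]
extra = a[1]                    # a second reference to instance2

del a[1]                        # removes only the list's reference
still_alive = list(finalized)   # snapshot: nothing finalized yet

del extra                       # the last reference is gone now
print(still_alive, finalized)
```

The finalizer runs only once the last reference disappears, not when the element is removed from the list.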
*I believe it is never. Inevitably someone will likely downvote my answer and leave a comment discussing the exception to the rule. But whatevs.
No. Calling del on a list element only removes a reference to an object from the list; it doesn't do anything (explicitly) to the object itself. However, if the reference in the list was the last one referring to the object, the object can now be destroyed and recycled. I think that the "normal" CPython implementation will immediately destroy and recycle the object; other variants' behaviour can vary.
If your object is resource-heavy and you want to be sure that the resources are freed correctly, use the with statement. It's very easy to leak resources when relying on destructors. See this SO post for more details.
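As a rough sketch of that advice, a class can implement the context manager protocol so that cleanup happens at a known point. The Resource class and its close() method are hypothetical stand-ins for whatever holds the memory or handle.

```python
class Resource(object):
    """Hypothetical resource-heavy object with an explicit close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions raised in the block


with Resource() as res:
    inside = res.closed  # still open inside the block
print(inside, res.closed)
```

The cleanup runs when the block exits, whether or not the object itself has been reclaimed yet.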

Python: how to "kill" a class instance/object?

I want a Roach class to "die" when it reaches a certain amount of "hunger", but I don't know how to delete the instance. I may be making a mistake with my terminology, but what I mean to say is that I have a ton of "roaches" on the window and I want specific ones to disappear entirely.
I would show you the code, but it's quite long. I have the Roach class being appended into a Mastermind classes roach population list.
In general:
Each binding of a variable to an object increases that object's internal reference counter.
There are several usual ways to decrease the reference count (that is, to remove a variable -> object binding):
leaving the scope (for example, returning from the function) in which the variable was first bound
destroying an object, which releases the references held by its attributes
calling del variable, which deletes the reference in the current context
After all references to an object are removed (counter == 0), it becomes a good candidate for garbage collection, but it is not guaranteed that it will be processed (reference here):
CPython currently uses a reference-counting scheme with (optional)
delayed detection of cyclically linked garbage, which collects most
objects as soon as they become unreachable, but is not guaranteed to
collect garbage containing circular references. See the documentation
of the gc module for information on controlling the collection of
cyclic garbage. Other implementations act differently and CPython may
change. Do not depend on immediate finalization of objects when they
become unreachable (ex: always close files).
To find out how many references to an object exist, use sys.getrefcount.
The module for configuring and inspecting garbage collection is gc.
The GC will call the object's __del__ method when destroying the object (additional reference here).
Some immutable objects, such as strings, are handled in a special way: if two variables contain the same string, they may reference the same object, but not always. See "identifying objects, why does the returned value from id(...) change?"
The id of an object can be found with the builtin function id.
The module memory_profiler looks interesting: a module for monitoring the memory usage of a Python program.
There are a lot of useful resources on the topic; one example is "Find all references to an object in python".
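The counting described above can be observed roughly like this on CPython. The absolute numbers returned by sys.getrefcount vary between versions, so this sketch only compares differences; the Thing class is a placeholder.

```python
import sys


class Thing(object):
    pass


obj = Thing()
before = sys.getrefcount(obj)

alias = obj                          # a second name bound to the same object
with_alias = sys.getrefcount(obj)
same_object = id(alias) == id(obj)   # both names identify one object

del alias                            # removes the binding, not the object
after = sys.getrefcount(obj)

print(with_alias - before, after - before, same_object)
```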
You cannot force a Python object to be deleted; it will be deleted when nothing references it (or when it's in a cycle only referred to by the items in the cycle). You will have to tell your "Mastermind" to erase its reference.
del somemastermind.roaches[n]
for i, roach in enumerate(roachpopulation_list):
    if roach.hunger == 100:
        del roachpopulation_list[i]
        break
Remove the instance by deleting it from your population list (which contains all the roach instances).
If your Roaches are Sprites created in Pygame, then calling .kill() on the sprite would remove it from all the groups that contain it.
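Putting these answers together, one possible sketch looks like the following. The question did not show its code, so the Roach and Mastermind classes, the remove_starved method, and the limit parameter are all assumptions. Rebuilding the list avoids deleting from it while iterating and removes every starved roach in one pass.

```python
class Roach(object):
    def __init__(self, hunger):
        self.hunger = hunger


class Mastermind(object):
    def __init__(self, roaches):
        self.roaches = list(roaches)

    def remove_starved(self, limit=100):
        # Keep only the roaches below the hunger limit; the others lose
        # their reference from the population list.
        self.roaches = [r for r in self.roaches if r.hunger < limit]


mastermind = Mastermind([Roach(10), Roach(100), Roach(55), Roach(120)])
mastermind.remove_starved()
remaining = [r.hunger for r in mastermind.roaches]
print(remaining)
```

If nothing else refers to the removed roaches (for example a drawing list), they become eligible for collection.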

Why doesn't __del__ work properly

I think this is the most common question on interviews:
class A:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print self.name,

aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a
So what is the output of this code, and why?
The output is nothing, but why?
Is that because a is a reference to an object in the list, and when we call del we remove this reference but not the object?
There are at least 2 references to the object that a references (variables are references to objects, they are not the objects themselves). There's the one reference inside the list, and then there's the reference "a". When you del a, you remove one reference (the variable a) but not the reference from inside the list.
Also note that Python doesn't guarantee that __del__ will ever be called ...
__del__ gets called when an object is destroyed; this will happen after the last possible reference to the object is removed from the program's accessible memory. Depending on the implementation this might happen immediately or might be after some time.
Your code just removes the local name a from the execution scope; the object remains in the list so is still accessible. Try writing del aa[0], for example.
From the docs:
Note del x doesn’t directly call x.__del__() — the former decrements the reference count for x by one, and the latter is only called when x‘s reference count reaches zero.
__del__ is triggered when the garbage collector finds an object to be destroyed. The garbage collector will try to destroy objects with a reference count of 0. del just decouples the label in the local namespace, thereby decrementing the reference count for the object in the interpreter. The behavior of the garbage collector is for the most part considered an implementation detail of the interpreter, so there's no guarantee that __del__ on objects will be called in any specific order, or even at all. That's why the behavior of this code is undefined.
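The difference between unbinding the loop name and removing the list's reference can be sketched as follows. It is written for Python 3 and assumes CPython's immediate reference counting; the finalized list is a hypothetical replacement for the print in the question.

```python
finalized = []  # hypothetical log instead of printing from __del__


class A(object):
    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)


aa = [A(str(i)) for i in range(3)]
for a in aa:
    del a                 # unbinds the loop name only; the list still refers
after_loop = list(finalized)

del aa[0]                 # drops the list's reference to the first object
after_del_item = list(finalized)
print(after_loop, after_del_item)
```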

Weak References in python

I have been trying to understand how python weak reference lists/dictionaries work. I've read the documentation for it, however I cannot figure out how they work, and what they can be used for. Could anyone give me a basic example of what they do and an explanation of how they work?
(EDIT) Using Thomas's code, when i substitute obj for [1,2,3] it throws:
Traceback (most recent call last):
File "C:/Users/nonya/Desktop/test.py", line 9, in <module>
r = weakref.ref(obj)
TypeError: cannot create weak reference to 'list' object
Theory
The reference count usually works like this: each time you create a reference to an object, it is increased by one, and whenever you delete a reference, it is decreased by one.
Weak references allow you to create references to an object that will not increase the reference count.
In CPython, an object whose reference count drops to 0 is reclaimed; the cyclic garbage collector in the gc module additionally finds groups of objects that only refer to each other.
You would use weak references for expensive objects, or to avoid circular references (although the garbage collector usually handles those on its own).
Usage
Here's a working example demonstrating their usage:
import weakref
import gc
class MyObject(object):
    def my_method(self):
        print('my_method was called!')

obj = MyObject()
r = weakref.ref(obj)
gc.collect()
assert r() is obj #r() allows you to access the object referenced: it's there.
obj = 1 #Let's change what obj references to
gc.collect()
assert r() is None #There is no object left: it was gc'ed.
Just want to point out that weakref.ref does not work for the built-in list because list instances have no __weakref__ slot.
For example, the following code defines a list container that supports weakref.
import weakref
class weaklist(list):
    __slots__ = ('__weakref__',)

l = weaklist()
r = weakref.ref(l)
The point is that they allow references to be retained to objects without preventing them from being garbage collected.
The two main reasons why you would want this are where you do your own periodic resource management, e.g. closing files, but because the time between such passes may be long, the garbage collector may do it for you; or where you create an object, and it may be relatively expensive to track down where it is in the programme, but you still want to deal with instances that actually exist.
The second case is probably the more common - it is appropriate when you are holding e.g. a list of objects to notify, and you don't want the notification system to prevent garbage collection.
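The notification case might be sketched like this, using weakref.WeakSet from the standard library. The Listener class and its notify method are hypothetical, and the immediate removal after del assumes CPython.

```python
import weakref


class Listener(object):
    def __init__(self):
        self.received = []

    def notify(self, message):
        self.received.append(message)


listeners = weakref.WeakSet()  # registry that does not keep listeners alive

first = Listener()
second = Listener()
listeners.add(first)
listeners.add(second)

del second  # on CPython the listener is freed and drops out of the set

# Copy to a list so the set cannot change size while we iterate.
for listener in list(listeners):
    listener.notify("hello")
count = len(listeners)
print(count, first.received)
```

The registry never needs an explicit unregister call; a listener disappears from it once the rest of the program lets go of it.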
Here is the example comparing dict and WeakValueDictionary:
import weakref
class C: pass
ci=C()
print(ci)
wvd = weakref.WeakValueDictionary({'key' : ci})
print(dict(wvd), len(wvd)) #1
del ci
print(dict(wvd), len(wvd)) #0
ci2=C()
d=dict()
d['key']=ci2
print(d, len(d))
del ci2
print(d, len(d))
And here is the output:
<__main__.C object at 0x00000213775A1E10>
{'key': <__main__.C object at 0x00000213775A1E10>} 1
{} 0
{'key': <__main__.C object at 0x0000021306B0E588>} 1
{'key': <__main__.C object at 0x0000021306B0E588>} 1
Note how in the first case, once we del ci, the actual object is also removed from the dictionary wvd.
In the case of a regular Python dict, we may try to remove the object, but it will still be there, as shown.
Note: if we use del, we do not need to call gc.collect() afterwards, since in CPython removing the last strong reference with del frees the object immediately.

Deleting attributes when deleting instance

class A:
    def __get(self):
        return self._x
    def __set(self, y):
        self._x = y
    def __delete_x(self):
        print('DELETING')
        del self._x
    x = property(__get,__set,__delete_x)

b = A()
# Here, when b is deleted, i'd like b.x to be deleted, i.e __delete_x()
# called (and for immediate consequence, "DELETING" printed)
del b
The semantics of the del statement don't really lend themselves to what you want here. del b simply removes the reference to the A object you just instantiated from the local scope frame / dictionary; this does not directly cause any operation to be performed on the object itself. If that was the last reference to the object, then the reference count dropping to zero, or the garbage collector collecting a cycle, may cause the object to be deallocated. You could observe this by adding a __del__ method to the object, or by adding a weakref callback that performs the desired actions.
Neither of the latter two solutions seems like a great idea, though; __del__ methods prevent the garbage collector from collecting any cycles involving the object; and while weakrefs do not suffer from this problem, in either case you may be running in a strange environment (such as during program shutdown), which may make it difficult to get done what you want to accomplish.
If you can expand on your exact use case, it may be that there is an entirely different approach to accomplishing your desired end goal, but it is difficult to speculate based on such a general and limited example.
To control what happens when an instance of class A goes away (whether by being deleted or garbage collected), you can implement special method __del__(self) in A. If you want to have your code involved when a specific attribute of that instance goes away, you can either wrap that attribute with a wrapper class which has __del__, or, probably better in most cases, use the weakref module (however, not all types are subject to being target of weak references, so you may also need some wrapping for this case).
Avoiding __del__ is generally preferable, if you possibly can, because it can interfere with garbage collection and thereby cause "memory leaks" if and when you have circular references.
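One way the weakref route could look is sketched below with weakref.finalize (available since Python 3.4). The log list and the _finalizer attribute are hypothetical, and the callback firing right after del assumes CPython.

```python
import weakref

log = []  # hypothetical stand-in for print('DELETING')


class A(object):
    def __init__(self):
        self._x = 42
        # The callback must not hold a reference to self, otherwise the
        # object could never become unreachable.
        self._finalizer = weakref.finalize(self, log.append, "DELETING")


b = A()
before = list(log)  # snapshot while the instance is still alive
del b               # on CPython the callback fires once the last reference goes
print(before, log)
```

Unlike __del__, the callback receives only the arguments registered up front, so any state it needs has to be passed in explicitly.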
An ugly way to do it would be :
def __del__(self):
    # Call the deleter of every property defined on the class.
    for name in dir(self.__class__):
        attr = getattr(self.__class__, name)
        if isinstance(attr, property) and attr.fdel is not None:
            attr.fdel(self)
