Remove object from list after lifetime expires - python

I am creating a program that spawns objects randomly. These objects have a limited lifetime.
I create these objects and place them in a list. The objects keep track of how long they exist and eventually expire. They are no longer needed after expiration.
I would like to delete the objects after they expire but I'm not sure how to reference the specific object in the list to delete it.
if something:
    list.append(SomeObject())
---- later ----
I would like a cleanup process that looks at the variable in the Object and if it is expired, then remove it from the list.
Thanks for your help in advance.

You can use the reference count if you define "no longer used" as "no other object keeps a reference". That is a reasonable definition: once no references exist, the object can no longer be accessed and may be disposed of. In fact, Python's garbage collector will do that for you.
Where it goes wrong is when you also keep all the instances in a list. That also counts as a reference to the object, so it will never be disposed of.
For example, take a list of state variables that are referenced not only by their owning objects but also by a list that allows linear access. Explicitly call a cleanup function from the accessor to keep the list clean:
from sys import getrefcount

GlobalStateList = []

def gcOnGlobalStateList():
    # Iterate in reverse so removing an item does not skip its neighbour.
    for s in reversed(GlobalStateList):
        if getrefcount(s) <= 3:  # 1=GlobalStateList, 2=loop variable s, 3=getrefcount()
            GlobalStateList.remove(s)

def getGlobalStateList():
    gcOnGlobalStateList()
    return GlobalStateList
Note that even looking at the refcount increases it, so the test-value is three or less.
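To see why the list keeps every instance alive, here is a small sketch using sys.getrefcount. It assumes CPython, and the State class and the registry list are hypothetical stand-ins. Adding an object to a list raises its reference count by one, and removing it lowers the count again.

```python
import sys

class State:
    """Hypothetical stand-in for whatever the list tracks."""

state = State()
registry = []

before = sys.getrefcount(state)   # includes the temporary reference held by the call itself
registry.append(state)
inside = sys.getrefcount(state)   # the list now holds one more reference
registry.remove(state)
after = sys.getrefcount(state)    # back to the starting value

print(before, inside, after)
```

The absolute numbers depend on the interpreter version, so only the differences between them are meaningful.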

Assuming that your concept of SomeObject "expiry" is not directly related to the underlying Python object lifetime (reference counting, etc.) I would suggest that the easiest way to purge the list is to occasionally run through it, dereferencing any expired objects:
lst = [obj for obj in lst if not obj.expired]
Note that you shouldn't call your own variables list, as this will shadow the built-in.
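A minimal sketch of that cleanup pass, assuming each object records its own expiry time. The lifetime parameter, the clock argument and the purge helper are illustrative names, not part of the question. The slice assignment rewrites the list in place, so any other code holding the same list sees the purge too.

```python
import time

class SomeObject:
    def __init__(self, lifetime, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + lifetime

    @property
    def expired(self):
        return self._clock() >= self._expires_at

def purge(objects):
    """Drop expired objects from the list in place."""
    objects[:] = [obj for obj in objects if not obj.expired]

# Deterministic demo with a fake clock instead of real waiting.
now = [0.0]
fake_clock = lambda: now[0]

objects = [SomeObject(5, fake_clock), SomeObject(10, fake_clock)]
now[0] = 7.0          # the first object (lifetime 5) has expired, the second has not
purge(objects)
print(len(objects))   # 1
```

In a real program you would leave the clock argument at its default and call purge from your main loop or before each access.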

Related

Garbage collection for a simple python class

I am writing a python class like this:
class MyImageProcessor:
    def __init__(self, image, metadata):
        self.image = image
        self.metadata = metadata
Both image and metadata are objects of a class written by a colleague. Now I need to make sure there is no waste of memory. I am thinking of defining a quit() method like this,
def quit(self):
    self.image = None
    self.metadata = None
    import gc
    gc.collect()
and suggest that users call quit() systematically. I would like to know whether this is the right way. In particular, do the instructions in quit() above guarantee that unused memory is properly collected?
Alternatively, I could rename quit() to the built-in __exit__(), and suggest users use the "with" syntax. But my question is more about whether the instructions in quit() indeed fulfill the garbage collection work one would need in this situation.
Thank you for your help.
In Python every object has a built-in reference count, and the variables (names) you create are only pointers to objects. Objects are either mutable or immutable (for example, if you change the value of an integer variable, the name is pointed at another integer object, while changing a list element does not change which object the list name points to).
The reference count basically counts how many references point to that object, and it is incremented and decremented automatically.
The garbage collector will destroy objects with zero references (actually not always; it takes extra steps to save time). You should check out this article.
Similarly to object constructors (__init__()), which are called on object creation, you can define destructors (__del__()), which are executed on object deletion (usually when the reference count drops to 0). According to this article, they are needed less in Python than in C++ because Python has a garbage collector that handles memory management automatically. You can check out those examples too.
Hope it helps :)
No need for quit() (Assuming you're using C-based python).
Python uses two methods of garbage collection, as alluded to in the other answers.
First, there's reference counting. Essentially each time you add a reference to an object it gets incremented & each time you remove the reference (e.g., it goes out of scope) it gets decremented.
From https://devguide.python.org/garbage_collector/:
When an object’s reference count becomes zero, the object is deallocated. If it contains references to other objects, their reference counts are decremented. Those other objects may be deallocated in turn, if this decrement makes their reference count become zero, and so on.
You can get information about current reference counts for an object using sys.getrefcount(x), but really, why bother.
The second way is through garbage collection (gc). [Reference counting is a type of garbage collection, but python specifically calls this second method "garbage collection" -- so we'll also use this terminology. ] This is intended to find those places where reference count is not zero, but the object is no longer accessible. ("Reference cycles") For example:
class MyObj:
    pass

x = MyObj()
x.self = x
Here, x refers to itself, so the actual reference count for x is more than 1. You can call del x but that merely removes it from your scope: it lives on because "someone" still has a reference to it.
gc, and specifically gc.collect() goes through objects looking for cycles like this and, when it finds an unreachable cycle (such as your x post deletion), it will deallocate the whole lot.
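The behaviour described above can be observed with a weak reference, which lets you check whether an object is still alive without keeping it alive. This is only a sketch that assumes CPython; make_cycle is a helper introduced for the demo, and the automatic collector is switched off temporarily so it cannot run at an inconvenient moment.

```python
import gc
import weakref

class MyObj:
    pass

def make_cycle():
    x = MyObj()
    x.self = x               # the object refers to itself
    return weakref.ref(x)    # a weak reference does not keep x alive

gc.disable()                 # keep the automatic collector out of the demo
try:
    ref = make_cycle()       # the local name x is gone once the function returns
    alive_before = ref() is not None   # the cycle keeps the object alive
    gc.collect()                       # the cycle detector finds and frees it
    alive_after = ref() is not None
finally:
    gc.enable()

print(alive_before, alive_after)
```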
Back to your question: You don't need to have a quit() method because as soon as your MyImageProcessor object goes out of scope, it will decrement the reference counters for image and metadata. If that puts them to zero, they're deallocated. If that doesn't, well, someone else is using them.
Your setting them to None first merely decrements the reference counts right then, but when MyImageProcessor goes out of scope, it won't decrement those reference counts again, because MyImageProcessor no longer holds the image or metadata objects! So you're just explicitly doing what Python already does for you for free: no more, no less.
You didn't create a cycle, so your calling gc.collect() is unlikely to change anything.
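As a rough check of that claim, the sketch below drops the last reference to a MyImageProcessor and uses a weak reference to see whether the image went with it. It assumes CPython's reference counting, and Payload is a hypothetical stand-in for the colleague's classes.

```python
import weakref

class Payload:
    """Hypothetical stand-in for the colleague's image/metadata classes."""

class MyImageProcessor:
    def __init__(self, image, metadata):
        self.image = image
        self.metadata = metadata

image = Payload()
image_ref = weakref.ref(image)       # lets us observe the image without owning it
processor = MyImageProcessor(image, Payload())

del image                            # the processor still holds the image
held_by_processor = image_ref() is not None

del processor                        # last reference to the processor, and so to the image
released = image_ref() is None

print(held_by_processor, released)
```

No quit() call and no gc.collect() is involved; releasing the processor is enough.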
Check out https://devguide.python.org/garbage_collector/ if you are interested in more earthy details.
I'm not sure whether this makes sense, but you could call
gc.get_count()
before and after
gc.collect()
to see whether anything has been removed.
See also: What are the count0, count1 and count2 values returned by Python's gc.get_count()?
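A small sketch of that idea, assuming CPython. The Node class is hypothetical and exists only to create some cyclic garbage. Besides the counters from gc.get_count(), the return value of gc.collect() is useful here, since it reports how many unreachable objects were found.

```python
import gc

class Node:
    pass

gc.disable()                  # avoid an automatic collection in the middle of the demo
try:
    gc.collect()              # start from a clean slate
    for _ in range(10):
        n = Node()
        n.me = n              # self-referencing cycle
    del n                     # now all ten cycles are unreachable
    counts_before = gc.get_count()   # tuple of three collection counters
    found = gc.collect()             # number of unreachable objects found
    counts_after = gc.get_count()
finally:
    gc.enable()

print(counts_before, found, counts_after)
```

If the code creates no cycles, as in the question, found stays at or near zero, which is a sign that calling gc.collect() by hand buys nothing.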

How do you know in advance if a method (or function) will alter the variable when called?

I am new to Python from R. I have recently spent a lot of time reading up on how everything in Python is an object, objects can call methods on themselves, methods are functions within a class, yada yada yada.
Here's what I don't understand. Take the following simple code:
mylist = [3, 1, 7]
If I want to know how many times the number 7 occurs, I can do:
mylist.count(7)
That, of course, returns 1. And if I want to save the count number to another variable:
seven_counts = mylist.count(7)
So far, so good. Other than the syntax, the behavior is similar to R. However, let's say I am thinking about adding a number to my list:
mylist.append(9)
Wait a minute, that method actually changed the variable itself! (i.e., "mylist" has been altered and now includes the number 9 as the fourth element in the list.) Assigning the result to a new variable (like I did with seven_counts) produces garbage:
newlist = mylist.append(9)
I find the inconsistency in this behavior a bit odd, and frankly undesirable. (Let's say I wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable.)
My question is simple:
Is there a way to know in advance if calling a particular method will actually alter your variable (object)?
Aside from reading the documentation (which for some methods will include type annotations specifying the return value) or playing with the method in the interactive interpreter (including using help() to check the docstring for a type annotation), no, you can't know up front just by looking at the method.
That said, the behavior you're seeing is intentional. Python methods either return a new modified copy of the object or modify the object in place; at least among built-ins, they never do both (some methods mutate the object and return a non-None value, but it's never the object just mutated; the pop method of dict and list is an example of this case).
This either/or behavior is intentional; if they didn't obey this rule, you'd have had an even more confusing and hard to identify problem, namely, determining whether append mutated the value it was called on, or returned a new object. You definitely got back a list, but is it a new list or the same list? If it mutated the value it was called on, then
newlist = mylist.append(9)
is a little strange; newlist and mylist would be aliases to the same list (so why have both names?). You might not even notice for a while; you'd continue using newlist, thinking it was independent of mylist, only to look at mylist and discover it was all messed up. By having all such "modify in place" methods return None (or at least, not the original object), the error is discovered more quickly/easily; if you try and use newlist, mistakenly believing it to be a list, you'll immediately get TypeErrors or AttributeErrors.
Basically, the only way to know in advance is to read the documentation. For methods whose name indicates a modifying operation, you can check the return value and often get an idea as to whether they're mutating. It helps to know what types are mutable in the first place; list, dict, set and bytearray are all mutable, and the methods they have that their immutable counterparts (aside from dict, which has no immutable counterpart) lack tend to mutate the object in place.
The default tends to be to mutate the object in place simply because that's more efficient; if you have a 100,000 element list, a default behavior for append that made a new 100,001 element list and returned it would be extremely inefficient (and there would be no obvious way to avoid it). For immutable types (e.g. str, tuple, frozenset) this is unavoidable, and you can use those types if you want a guarantee that the object is never mutated in place, but it comes at a cost of unnecessary creation and destruction of objects that will slow down your code in most cases.
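The either/or convention is easy to see on a list. This short example uses only built-in list methods and sorted(); the variable names are just for illustration.

```python
mylist = [3, 1, 7]

result = mylist.append(9)   # mutates in place and returns None
copy = sorted(mylist)       # returns a new sorted list, mylist is untouched
in_place = mylist.sort()    # mutates in place and returns None
last = mylist.pop()         # mutates and returns the removed item, not the list

print(result, in_place)     # None None
print(copy)                 # [1, 3, 7, 9]
print(last, mylist)         # 9 [1, 3, 7]
```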
Just check the docstring:
>>> list.count.__doc__
'L.count(value) -> integer -- return number of occurrences of value'
>>> list.append.__doc__
'L.append(object) -> None -- append object to end'
There isn't really an easy way to tell, but:
immutable object --> no way of changing through method calls
So, for example, tuple has no methods which affect the tuple as it is unchangeable so methods can only return new instances.
And if you "wanted to see what the result of the append looked like first and then have the option to decide whether or not I want to assign it to a new variable" then you can concatenate the list with a new list with one element.
i.e.
>>> l = [1,2,3]
>>> k = l + [4]
>>> l
[1, 2, 3]
>>> k
[1, 2, 3, 4]
Not from merely your invocation (your method call). You can guarantee that the method won't change the object if you pass in only immutable objects, but some methods are defined to change the object -- and will either not be defined for the one you use, or will fault in execution.
In real life, you look at the method's documentation: that will tell you exactly what happens.
[I was about to include what Joe Iddon's answer covers ...]

Python: Delete object referenced from tuple

I have a very large object inside a tuple that I would like to explicitly delete. Unfortunately, since tuples are immutable, just doing del tup[0] throws:
TypeError: 'tuple' object doesn't support item deletion
I don't actually need tup at all anymore, so I could delete the whole thing, I just want to make sure that the large object referenced from tup is deleted immediately without waiting for the garbage collector.
Will just deleting tup remove it immediately or is there a better way?
del does not reclaim memory by itself, and it does not delete the object. Depending on what you delete, it removes an index (like del somelist[0]), an attribute (like del foo.attr) or a variable (like del foo).
In each case del decrements the object's reference count, and if the count hits zero, the memory will be reclaimed, usually immediately.
If there are however cyclic references, the garbage collector has to walk by to remove them. So if you absolutely need to remove the large object with the tuple, you can use:
import gc
del tup
gc.collect()
Note that if other items refer to the large object, it will still not be removed. So make sure that there is only one reference.
For simple variables that have no special delete procedure (some have, like closing a file), del will thus remove the variable from the local scope, and set the reference count one less than it was before. This will usually be faster than setting the variable to None for instance, since in the latter case, it will increment the reference count of the None singleton.
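Here is a sketch of the tuple case, assuming CPython and no other references to the large object. Big is a hypothetical class, and weakref.finalize is used only to record the moment the object is actually freed.

```python
import weakref

class Big:
    """Hypothetical large object."""
    def __init__(self):
        self.data = bytearray(10**6)

events = []
big = Big()
weakref.finalize(big, events.append, "big freed")   # runs when big is reclaimed

tup = (big, "something else")
del big                       # the tuple still refers to the object
freed_early = list(events)    # []

del tup                       # last reference gone, so the object is freed right away
freed_late = list(events)     # ["big freed"]

print(freed_early, freed_late)
```

In this situation no gc.collect() call is needed, because nothing in the tuple forms a reference cycle.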

Call destructor of an element from a list

I have something like that:
a = [instance1, instance2, ...]
if I do a
del a[1]
instance2 is removed from the list, but is instance2's destructor method called?
I'm interested in this because my code uses a lot of memory and I need to free memory deleting instances from a list.
Coming from a language like c++ (as I did), this tends to be a subject many people find difficult to grasp when first learning Python.
The bottom line is this: when you do del XXX, you are never* deleting an object. You are only deleting an object reference. However, in practice, assuming there are no other references lying about to the instance2 object, deleting it from your list will free the memory as you desire.
If you don't understand the difference between an object and an object reference, read on.
Python: Pass by value, or pass by reference?
You are likely familiar with the concept of passing arguments to a function by reference, or by value. However, Python does things differently. Arguments are always passed by object reference. Read this article for a helpful explanation of what this means.
To summarize: this means that when you pass a variable to a function, you are not passing a copy of the value of the variable (pass by value), nor are you passing the caller's variable itself (pass by reference). You are passing a reference to the object, which is bound to a new local name inside the function.
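A short illustration of what this means in practice. The two functions are made up for the example: one mutates the object it was handed, the other only rebinds its own local name.

```python
def mutate(items):
    items.append(4)    # changes the object that the caller also refers to

def rebind(items):
    items = [0]        # only points the local name at a different object

mylist = [1, 2, 3]
mutate(mylist)
rebind(mylist)
print(mylist)          # [1, 2, 3, 4]
```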
What does this have to do with del...?
Well, I'll tell you.
Say you do this:
def deleteit(thing):
    del thing

mylist = [1,2,3]
deleteit(mylist)
...what do you think will happen? Has mylist been deleted from the global namespace?
The answer is NO:
assert mylist # No error
The reason is that when you do del thing in the deleteit function, you are only deleting a local object reference to the object. That object reference exists ONLY inside of the function. As a sidebar, you might ask: is it possible to delete an object reference from the global namespace while inside a function? The answer is yes, but you have to declare the object reference to be part of the global namespace first:
def deletemylist():
    global mylist
    del mylist

mylist = [1,2,3]
deletemylist()
assert mylist  # ERROR (NameError), as expected
Putting it all together
Now to get back to your original question. When, in ANY namespace, you do this:
del XXX
...you have NOT deleted the object signified by XXX. You CAN'T do that. You have only deleted the object reference XXX, which refers to some object in memory. The object itself is managed by the Python memory manager. This is a very important distinction.
Note that as a consequence, when you override the __del__ method of some object, which gets called when the object is deleted (NOT the object reference!):
class MyClass:
    def __del__(self):
        # object defines no __del__, so there is no parent finalizer to call
        print(self, "deleted")

m = MyClass()
del m
...the print statement inside the __del__ method does not necessarily occur immediately after you do del m. It only occurs at the point in time the object itself is deleted, and that is not up to you. It is up to the Python memory manager. When all object references in all the namespaces have been deleted, the __del__ method will eventually be executed. But not necessarily immediately.
The same is true when you delete an object reference that is part of a list, like in the original example. When you do del a[1], only the object reference to the object signified by a[1] is deleted, and the __del__ method of that object may or may not be called immediately (though as stated before, it will eventually be called once there are no more references to the object, and the object is garbage collected by the memory manager).
As a result of this, it is not recommended that you put things in the __del__ method that you want to happen immediately upon del mything, because it may not happen that way.
*I believe it is never. Inevitably someone will likely downvote my answer and leave a comment discussing the exception to the rule. But whatevs.
No. Calling del on a list element only removes a reference to an object from the list, it doesn't do anything (explicitly) to the object itself. However: If the reference in the list was the last one referring to the object, the object can now be destroyed and recycled. I think that the "normal" CPython implementation will immediately destroy and recycle the object, other variants' behaviour can vary.
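The following sketch shows both outcomes on CPython, where the finalizer runs as soon as the last reference disappears. Other implementations may delay it. The Instance class and the log list are hypothetical and only record when __del__ runs.

```python
class Instance:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        self.log.append(self.name)   # record that the finalizer ran

log = []
a = [Instance("instance1", log), Instance("instance2", log)]
keep = a[0]          # a second reference to instance1

del a[1]             # the list held the only reference to instance2
del a[0]             # instance1 is still referenced by keep, so it survives

print(log)           # on CPython: ['instance2']
```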
If your object is resource-heavy and you want to be sure that the resources are freed correctly, use the with statement. It's very easy to leak resources when relying on destructors. See this SO post for more details.
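A minimal sketch of the with approach, using a hypothetical Resource class. Cleanup happens in __exit__ at a well-defined point, regardless of when the object itself is reclaimed.

```python
class Resource:
    """Hypothetical resource holder with explicit cleanup."""
    def __init__(self, name):
        self.name = name
        self.is_open = True

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False     # do not swallow exceptions raised in the block

with Resource("big buffer") as res:
    open_inside = res.is_open

print(open_inside, res.is_open)   # True False
```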

Python: how to "kill" a class instance/object?

I want a Roach class to "die" when it reaches a certain amount of "hunger", but I don't know how to delete the instance. I may be making a mistake with my terminology, but what I mean to say is that I have a ton of "roaches" on the window and I want specific ones to disappear entirely.
I would show you the code, but it's quite long. I have the Roach instances being appended to a Mastermind class's roach population list.
In general:
Each variable -> object binding increases the object's internal reference counter.
There are several usual ways to decrease the reference count (remove a variable -> object binding):
exiting the function scope where the variable was bound
destroying an object, which releases the references held by all of its attributes
calling del variable, which deletes the reference in the current context
After all references to an object are removed (counter == 0), it becomes a good candidate for garbage collection, but it is not guaranteed that it will be processed (reference here):
CPython currently uses a reference-counting scheme with (optional)
delayed detection of cyclically linked garbage, which collects most
objects as soon as they become unreachable, but is not guaranteed to
collect garbage containing circular references. See the documentation
of the gc module for information on controlling the collection of
cyclic garbage. Other implementations act differently and CPython may
change. Do not depend on immediate finalization of objects when they
become unreachable (ex: always close files).
To find out how many references to an object exist, use sys.getrefcount.
The module for configuring and checking garbage collection is gc.
The GC will call the object's __del__ method when destroying it (additional reference here).
Some immutable objects such as strings are handled in a special way: if two variables contain the same string, they may reference the same object, but not always. See "Identifying objects, why does the returned value from id(...) change?"
The id of an object can be found with the built-in function id.
The memory_profiler module looks interesting: a module for monitoring memory usage of a Python program.
There are lots of useful resources on the topic; one example: "Find all references to an object in python".
You cannot force a Python object to be deleted; it will be deleted when nothing references it (or when it's in a cycle only referred to by the items in the cycle). You will have to tell your "Mastermind" to erase its reference.
del somemastermind.roaches[n]
for i, roach in enumerate(roachpopulation_list):
    if roach.hunger == 100:
        del roachpopulation_list[i]
        break
Remove the instance by deleting it from your population list (the one containing all the roach instances).
If your Roaches are Sprites created in Pygame, then calling the sprite's kill() method removes it from every group it belongs to.
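For the plain-list approach from the earlier answers, a sketch that removes every starved roach in one pass might look like this. The class layout, the MAX_HUNGER threshold and the remove_starved method are assumptions, since the original code was not shown.

```python
class Roach:
    def __init__(self, hunger=0):
        self.hunger = hunger

class Mastermind:
    MAX_HUNGER = 100   # assumed threshold at which a roach "dies"

    def __init__(self):
        self.roaches = []

    def remove_starved(self):
        # Rebuild the list in place, keeping only roaches below the threshold.
        self.roaches[:] = [r for r in self.roaches if r.hunger < self.MAX_HUNGER]

mastermind = Mastermind()
mastermind.roaches.extend(Roach(h) for h in (10, 100, 55, 120))
mastermind.remove_starved()
print([r.hunger for r in mastermind.roaches])   # [10, 55]
```

Once the list no longer refers to a roach and nothing else does either, Python reclaims it on its own.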
