Usage of Cython directive no_gc - python

In Cython 0.25 the no_gc directive was added. The documentation for this new directive (as well as for the related no_gc_clear directive) can be found here, but the only thing I really understand about it is that it can speed up your code by disabling certain aspects of garbage collection.
I am interested because I have some high-performance Cython code which uses extension types, and I understand that no_gc can speed things up further. In my code, instances of extension types are always kept alive until the program exits, which makes me think that disabling garbage collection for them might be OK.
I guess what I really need is an example where the usage of no_gc goes bad and leads to memory leaks, together with an explanation of exactly why that happens.

It's to do with circular references: when an instance a holds a reference to a Python object that in turn references a, then a can never be freed through reference counting alone, so Python periodically tries to detect and break such cycles.
A very trivial example of a class that could cause issues is:
# Cython code:
cdef class A:
    cdef param
    def __init__(self):
        self.param = self
(and some Python code to run it)
import cython_module
while True:
    cython_module.A()
This is fine as is (the cycles are detected and they get deallocated every so often) but if you add no_gc then you will run out of memory.
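For reference, the directive is applied per extension type; a minimal sketch assuming the decorator form from the Cython documentation (cimport cython, then @cython.no_gc):
# Cython code: same class as above, but opted out of the cyclic GC
cimport cython

@cython.no_gc
cdef class A:
    cdef param
    def __init__(self):
        # self.param = self creates a cycle that reference counting alone can
        # never break; with no_gc the cycle collector will not see it either
        self.param = self
Running the while True loop above against this version grows memory without bound, because the only mechanism that could have reclaimed the self-referencing instances has been switched off.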
A more realistic example might be a parent/child pair that store a reference to each other.
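A quick sketch of that shape (plain Python, with names invented for illustration; the same applies to cdef classes):
class Parent:
    def __init__(self):
        self.children = []
    def add_child(self):
        child = Child(self)
        self.children.append(child)
        return child

class Child:
    def __init__(self, parent):
        # parent -> children -> child -> parent: a reference cycle
        self.parent = parent

p = Parent()
p.add_child()
del p  # reference counts never reach zero; only the cycle collector can reclaim these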
It's worth adding that the performance gains are likely to be small. The garbage collector is only run occasionally, in situations where a lot of objects have been allocated and few have been freed (https://docs.python.org/3/library/gc.html - see set_threshold). It's hopefully unlikely that this describes your high-performance code.
There's probably also a small performance cost on allocation and deallocation of your objects with GC, to add/remove them from the list of tracked objects (but again, hopefully you aren't allocating/deallocating huge numbers).
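You can inspect and tune those thresholds yourself with the standard gc module; a small sketch (the defaults shown are CPython's):
import gc

print(gc.get_threshold())  # (700, 10, 10) by default in CPython
print(gc.get_count())      # current allocation counts per generation

# Raise the generation-0 threshold so automatic collections run less often,
# or disable them entirely and trigger a pass yourself when convenient.
gc.set_threshold(10000, 10, 10)
gc.collect()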
Finally, if your class never stores any references to Python objects then it's effectively no_gc anyway. Setting the option will do no harm but also do no good.
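For example, a cdef class like the following holds only C-level fields (a hypothetical sketch), so there is nothing for the cycle collector to track and no_gc changes nothing:
# Cython code: no Python object attributes, so no cycles are possible
cdef class Point:
    cdef double x
    cdef double y

    def __init__(self, double x, double y):
        self.x = x
        self.y = y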

Related

Does this code create a memory leak in python?

Consider the following code for illustration purposes:
import mod
f1s = ["A1", "B1", "C1"]
f2s = ["A2", "B2", "C2"]
for f1, f2 in zip(f1s, f2s):
    # Creating an object
    acumulator = mod.AcumulatorObject()
    # Using object
    acumulator.append(f1)
    acumulator.append(f2)
    # Output of object
    acumulator.print()
So, I use an instance of a class at the beginning of the loop to perform an operation. For each tuple in the loop I need to perform the same action; however, I cannot reuse the same object because it would carry over the effect of the previous iteration. Therefore, at the beginning of every iteration I create a new instance.
My question is whether doing this creates a memory leak. What action do I have to take for each object created? (Delete it maybe? Or is it cleared by assigning the new object to the same name?)
tl;dr: no
The reference implementation of Python uses reference counting for garbage collection. There are other implementations that use different GC strategies, and this affects the precise time at which __del__ methods are called, which may or may not be reliable or timely in PyPy, Jython or IronPython. These differences are not important unless you are dealing with resources like file handles and other expensive system resources.
In CPython the GC will wipe out objects as soon as their reference count drops to zero. For example, when you do acumulator = mod.AcumulatorObject() inside a for loop, a new object replaces the old one at the next iteration - and since there are no other variables referencing the old object, it is reclaimed right away. The reference implementation CPython will spoil you with things like releasing resources automatically when they go out of scope, but YMMV regarding other implementations.
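A hedged illustration of that using weakref to watch the old object disappear when the name is rebound (Acumulator here is a stand-in class, not the questioner's mod module):
import weakref

class Acumulator:
    def append(self, item):
        pass

old = Acumulator()
watcher = weakref.ref(old)

old = Acumulator()        # rebinding drops the last reference to the first instance
print(watcher() is None)  # True in CPython: it was reclaimed immediately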
That is why many people commented that memory leaks are not a concern in Python.
You have complete control over CPython's garbage collector using the gc module. The default settings are pretty conservative, and in 10 years doing Python for a living I never had to fire a GC cycle manually - but I've seen a situation where delaying it helped performance:
Yes, I had previously played with sys.setcheckinterval. I changed it to 1000 (from its default of 100), but it didn't do any measurable difference. Disabling Garbage Collection has helped - thanks. This has been the biggest speedup so far - saving about 20% (171 minutes for the whole run, down to 135 minutes) - I'm not sure what the error bars are on that, but it must be a statistically significant increase.
Just follow best practices like wrapping system resources using with (or try/finally blocks) and you should have no problems.
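For instance (a generic sketch; the filename and the process function are placeholders):
def process(handle):
    return handle.read()

# Preferred: the file is closed as soon as the block exits, even on error
with open("data.txt") as fh:
    data = process(fh)

# Equivalent explicit form
fh = open("data.txt")
try:
    data = process(fh)
finally:
    fh.close()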

Memory deallocation in Linked list Python

While deleting the end node from the linked list we just set the link of the node pointing to the end node to None. Does that mean that the end node is destroyed and the memory occupied by it has been released?
You ask: "Does that mean that the end node is destroyed and memory occupied by it has been released?"
With the little information you have given the answer to your question as you posed it is definitely not a plain unqualified "yes".
The simplest example of why "yes" is wrong is that if there is any other reference to that end node, then it can't immediately be released - if that were the case then nothing much would work, would it? However, that doesn't mean the node won't ever be regarded as deletable.
Moreover, even once releasable, that doesn't mean the memory "has been released" - this is implementation-dependent and may well not be deterministic, i.e. you can't necessarily rely on the memory having been immediately released, or predict when (if ever) it is actually released.
The "garbage collector" metaphor is used to refer to recovering unused memory because IRL garbage collection happens every now and then but can't be relied on to happen (or have happened) at a particular time.
What happens to unreferenced data has nothing to do with the language specification, which is another reason why the answer is not a plain "yes". It is completely implementation-dependent. You don't say if you are using CPython or Jython, or some other flavour. You need to refer to the documentation for the implementation you are using. CPython does expose its garbage collector, see e.g. https://docs.python.org/2/library/gc.html and https://docs.python.org/3/library/gc.html, and Jython uses the Java garbage collector. You may or may not be able to influence their behaviour; refer to the documentation for the interpreter you are using.
The reasons for not necessarily recycling releasable memory immediately are usually to do with performance - why do work which isn't needed? But if your interpreter does postpone recycling, it will at some point, when resources become limited by some criteria, make an effort to tidy up - do the garbage collection. This means that 99.9...% of the time you don't need to concern yourself with the recycling, because it is handled automatically (with a corresponding overhead cost) once the interpreter implementation considers it necessary.
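A small CPython-specific sketch of that qualified "yes" (Node is a hypothetical minimal class; the immediate reclamation shown relies on CPython's reference counting):
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

head = Node(1)
head.next = Node(2)          # the "end node"
tail_ref = weakref.ref(head.next)

head.next = None             # unlink the end node
print(tail_ref() is None)    # True here: no other reference remained

# But if something else still references it, it is not released:
head.next = Node(3)
keep = head.next
head.next = None
print(keep.value)            # 3 - still alive via `keep`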
Yes.
Python has a garbage collector so objects that cannot be reached in any way are automatically destroyed and their memory will be reused for other objects created in the future.

Are all Python objects tracked by the garbage collector?

I'm trying to debug a memory leak (see question Memory leak in Python Twisted: where is it?).
When the garbage collector is running, does it have access to all Python objects created by the Python interpreter? If we suppose Python C libraries are not leaking, should RSS memory usage grow linearly with respect to the GC object count? What about sys.getobjects?
CPython uses two mechanisms to clean up garbage. One is reference counting, which affects all objects but which can't clean up objects that (directly or indirectly) refer to each other. That's where the actual garbage collector comes in: python has the gc module, which searches for cyclic references in objects it knows about. Only objects that can potentially be part of a reference cycle need to worry about participating in the cyclic gc. So, for example, lists do, but strings do not; strings don't reference any other objects. (In fact, the story is a little more complicated, as there's two ways of participating in cyclic gc, but that isn't really relevant here.)
All Python classes (and instances thereof) automatically get tracked by the cyclic gc. Types defined in C aren't, unless they put in a little effort. All the builtin types that could be part of a cycle do. But this does mean the gc module only knows about the types that play along.
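You can check this per object with gc.is_tracked (a quick illustration; results for borderline cases can vary between CPython versions):
import gc

class Thing:
    pass

print(gc.is_tracked("a string"))  # False - strings cannot reference other objects
print(gc.is_tracked(42))          # False
print(gc.is_tracked([1, 2, 3]))   # True - containers can take part in cycles
print(gc.is_tracked(Thing()))     # True - instances of Python classes are tracked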
Apart from the collection mechanism there's also the fact that Python has its own aggregating memory allocator (obmalloc), which allocates entire memory arenas and uses the memory for most of the smaller objects it creates. Python now does free these arenas when they're completely empty (for a long time it didn't), but actually emptying an arena is fairly rare: because CPython objects aren't movable, you can't just move some stragglers to another arena.
The RSS does not grow linearly with the number of Python objects, because Python objects can vary in size. An int object is usually much smaller than a big list.
I suppose that you mean gc.get_objects when you wrote sys.getobjects. This function gives you a list of all reachable objects. If you suspect a leak, you can iterate over this list and try to find objects that should have been freed already. (For instance, you might know that all objects of a certain type are supposed to be freed at a certain point.)
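One simple way to sift through that list is to count live objects by type at two points in the program and compare; a rough sketch:
import gc
from collections import Counter

def type_counts():
    return Counter(type(o).__name__ for o in gc.get_objects())

before = type_counts()
# ... run the code you suspect of leaking ...
after = type_counts()

for name, count in (after - before).most_common(10):
    print(name, count)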
A Python class designed to be unable to be involved in cycles is not tracked by the GC.
class V(object):
    __slots__ = ()
Instances of V cannot have any attribute. Its size is 16, like the size of object().
sys.getsizeof(V()) and V().__sizeof__() return the same value: 16.
V isn't useful, but I imagine that other classes derived from base types (e.g. tuple), that only add methods, can be crafted so that reference counting is enough to manage them in memory.

What does cpython do to help detect object cycles(reference counting)?

From what I've read about CPython it seems like it does reference counting + something extra to detect/free objects pointing to each other. (Correct me if I'm wrong.) Could someone explain the something extra? Also, does this guarantee* no cycle leaking? If not, is there any research into an algorithm proven to add to reference counting so that it never leaks*? Would this just be running a non-reference-counting tracing GC every so often?
*discounting bugs and problems with modules using foreign function interface
As explained in the documentation for gc.garbage, there is no guarantee that no leaks occur; specifically, cyclic objects with __del__ methods are not collected by default. For such objects, the cyclic links have to be manually broken to enable further GC.
From what I understand by browsing the CPython sourcecode, the interpreter keeps references to all objects under its control. The "extra" garbage collector runs a mark-and-sweep-like algorithm through the heap, remembers for each object if it is reachable from the "outside" and, if not, deletes it. (The GC is generational, but it may be run explicitly from the gc module with a generation argument.)
The only efficient algorithm that I could think of that satisfies your criteria would indeed be a "full" GC algorithm to augment reference counting and this is what seems to be implemented in Python. I'm not an expert in these matters though.
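The effect is easy to observe with the gc module: a cycle that reference counting alone cannot free is reclaimed by an explicit collection pass (the exact count returned may vary):
import gc

a, b = [], []
a.append(b)  # a -> b
b.append(a)  # b -> a: a reference cycle

del a, b     # reference counts stay at 1 because of the cycle
gc.disable() # make sure only the explicit pass below cleans up
print(gc.collect())  # > 0: the unreachable cycle was found and freed
gc.enable()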

Python: Behavior of the garbage collector

I have a Django application that exhibits some strange garbage collection behavior. There is one view in particular that will just keep growing the VM size significantly every time it is called - up to a certain limit, at which point usage drops back again. The problem is that it's taking considerable time until that point is reached, and in fact the virtual machine running my app doesn't have enough memory for all FCGI processes to take as much memory as they then sometimes do.
I've spent the last two days investigating this and learning about Python garbage collection, and I think I do understand what is happening now - for the most part. When using
gc.set_debug(gc.DEBUG_STATS)
Then for a single request, I see the following output:
>>> c = django.test.Client()
>>> c.get('/the/view/')
gc: collecting generation 0...
gc: objects in each generation: 724 5748 147341
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 731 6460 147341
gc: done.
[...more of the same...]
gc: collecting generation 1...
gc: objects in each generation: 718 8577 147341
gc: done.
gc: collecting generation 0...
gc: objects in each generation: 714 0 156614
gc: done.
[...more of the same...]
gc: collecting generation 0...
gc: objects in each generation: 715 5578 156612
gc: done.
So essentially, a huge number of objects are allocated, but are initially moved to generation 1, and when gen 1 is swept during the same request, they are moved to generation 2. If I do a manual gc.collect(2) afterwards, they are removed. And, as I mentioned, they are also removed when the next automatic gen 2 sweep happens, which, if I understand correctly, would in this case be something like every 10 requests (at this point the app needs about 150MB).
Alright, so initially I thought that there might be some cyclic referencing going on within the processing of one request that prevents any of these objects from being collected within the handling of that request. However, I've spent hours trying to find one using pympler.muppy and objgraph, both after the request and by debugging inside the request processing, and there don't seem to be any. Rather, it seems the 14,000 or so objects that are created during the request are all within a reference chain to some request-global object, i.e. once the request goes away, they can be freed.
That has been my attempt at explaining it, anyway. However, if that's true and there are indeed no cyclic dependencies, shouldn't the whole tree of objects be freed once whatever request object causes them to be held goes away, without the garbage collector being involved, purely by virtue of the reference counts dropping to zero?
With that setup, here are my questions:
Does the above even make sense, or do I have to look for the problem elsewhere? Is it just an unfortunate accident that significant data is kept around for so long in this particular use case?
Is there anything I can do to avoid the issue? I already see some potential to optimize the view, but that appears to be a solution with limited scope - although I am not sure what a generic one would be, either; how advisable is it, for example, to call gc.collect() or gc.set_threshold() manually?
In terms of how the garbage collector itself works:
Do I understand correctly that an object is always moved to the next generation if a sweep looks at it and determines that the references it has are not cyclic, but can in fact be traced to a root object?
What happens if the gc does a, say, generation 1 sweep, and finds an object that is referenced by an object within generation 2; does it follow that relationship inside generation 2, or does it wait for a generation 2 sweep to occur before analyzing the situation?
When using gc.DEBUG_STATS, I care primarily about the "objects in each generation" info; however, I keep getting hundreds of "gc: 0.0740s elapsed.", "gc: 1258233035.9370s elapsed." messages; they are totally inconvenient - it takes considerable time for them to be printed out, and they make the interesting things a lot harder to find. Is there a way to get rid of them?
I don't suppose there is a way to do a gc.get_objects() by generation, i.e. only retrieve the objects from generation 2, for example?
Does the above even make sense, or do I have to look for the problem elsewhere? Is it just an unfortunate accident that significant data is kept around for so long in this particular use case?
Yes, it does make sense. And yes, there are other issues worth considering. Django uses threading.local as the base for DatabaseWrapper (and some contribs use it to make the request object accessible from places where it's not passed explicitly). These global objects survive requests and can keep references to objects until some other view is handled in the thread.
Is there anything I can do to avoid the issue? I already see some potential to optimize the view, but that appears to be a solution with limited scope - although I am not sure what a generic one would be, either; how advisable is it, for example, to call gc.collect() or gc.set_threshold() manually?
General advice (probably you know it, but anyway): avoid circular references and globals (including threading.local). Try to break cycles and clear globals when Django's design makes them hard to avoid. gc.get_referrers(obj) might help you find the places requiring attention. Another way is to disable the garbage collector and call it manually at the end of each request, which is the best place to do so (this will prevent objects from moving to the next generation).
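A hedged sketch of that "collect manually after each request" idea, written as a Django middleware (the class name and the new-style middleware form are my own choices, not from the original answer):
import gc

class GCMiddleware:
    """Turn off threshold-based collection and run one pass per request."""

    def __init__(self, get_response):
        self.get_response = get_response
        gc.disable()  # no automatic collections in the middle of a request

    def __call__(self, request):
        response = self.get_response(request)
        gc.collect()  # clean up request-lifetime cycles in one predictable place
        return response
Added to MIDDLEWARE in settings, this moves the collection cost to a single, predictable point at the end of each request.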
I don't suppose there is a way to do a gc.get_objects() by generation, i.e. only retrieve the objects from generation 2, for example?
Unfortunately this is not possible with the gc interface. But there are several ways to go. You can consider only the end of the list returned by gc.get_objects(), since objects in this list are sorted by generation. You can compare the list with the one returned from a previous call by storing weak references to the objects (e.g. in a WeakKeyDictionary) between calls. You can rewrite gc.get_objects() in your own C module (it's easy, mostly copy-paste programming!) since objects are stored by generation internally, or even access the internal structures with ctypes (this requires a quite deep understanding of ctypes).
I think your analysis looks sound. I'm not an expert on the gc, so whenever I have a problem like this I just add a call to gc.collect() in an appropriate, non-time-critical place, and forget about it.
I'd suggest you call gc.collect() in your view(s) and see what effect it has on your response time and your memory usage.
Note also this question which suggests that setting DEBUG=True eats memory like it is nearly past its sell by date.
