How to check for memory leaks in Guile extension modules? - python

I develop an extension module for Guile, written in C. This extension module embeds a Python interpreter.
Since this extension module invokes the Python interpreter, I need to verify that it properly manages the memory occupied by Python objects.
I found that the Python interpreter is well-behaved in its own memory handling, so that by running valgrind I can find memory leaks due to bugs in my own Python interpreter embedding code, if there are no other interfering factors.
However, when I run Guile under valgrind, valgrind reports memory leaks. Such memory leaks obscure any memory leaks due to my own code.
The question is: what can I do to separate memory leaks caused by bugs in my code from the leaks valgrind attributes to Guile? Another tool instead of valgrind? Special valgrind options? Or should I give up and rely on manual code walkthroughs?

You've got a couple of options. One is to write a suppressions file for valgrind that turns off reporting of the stuff you're not working on. Python has such a file, for example:
http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp
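A suppression entry is just a named stack pattern that valgrind should stay quiet about; you can have valgrind print ready-made entries for the Guile-internal reports with --gen-suppressions=all and paste them into your own file. A minimal sketch (the file name guile.supp and the frame patterns are illustrative, not taken from a real Guile trace):
{
   guile-internal-startup-leak
   Memcheck:Leak
   fun:malloc
   obj:*/libguile*
}
valgrind --leak-check=full --suppressions=guile.supp ./your-program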
If valgrind doesn't like your setup, another possibility is using libmudflap; you compile your program with gcc -fmudflap -lmudflap, and the resulting code is instrumented for pointer debugging. Described in the gcc docs, and here: http://gcc.gnu.org/wiki/Mudflap_Pointer_Debugging

Related

Instrument memory access of python scripts

My research requires processing memory traces of applications. For C/C++ programs this is easy using Intel's Pin library. However, as suggested in Use Intel Pin to instrument Python scripts, I may need to instrument the Python runtime itself, and I'm not sure that would represent the true memory behavior of a given Python script because of the interpreter's own overhead (if this is not the case, please comment). The existing Python memory profilers I've found only talk about runtime memory "usage" in terms of heap space and the like.
I ended up making an executable from my Python script using PyInstaller and running my PINTool over it. However, I'm not sure this is the right approach.
Is there any way (a library, or a hack into the Python runtime) that may help in getting the memory traces of accesses made by Python scripts?
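A hedged note: without native instrumentation (Pin, valgrind, etc.) the Python runtime can only report allocation-level information, not individual loads and stores. If allocation traces are enough for the analysis, the standard-library tracemalloc module records where every allocation made through Python's allocator came from; run_workload() below is a placeholder for whatever your script does:
import tracemalloc

tracemalloc.start(25)                      # keep up to 25 frames per allocation
run_workload()                             # placeholder for the script under study
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("traceback")[:10]:
    print(stat)                            # size, count, and the allocating call site
    for line in stat.traceback.format():
        print("   ", line)
This will not reproduce what a PINTool sees, since reads and writes to existing objects never show up; it only narrows down which Python code is responsible for which allocations.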

Python C extension memory leakage

I have written a C extension for my Python programs, and I just noticed that there is a memory leak. The C program itself doesn't leak memory when run on its own, so I suspect a reference-counting problem. Currently, when I run my program from the Python console, the python3 process still holds a lot of memory after the computation has finished, indicating that some objects are not being released. Is there any way to find out what those objects are, or where and when they were allocated?
The C extension is part of a big package, so it is impossible to paste the whole package here.
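A hedged sketch of one way to narrow this down from the Python side: compare tracemalloc snapshots taken before and after the computation to see which allocations survive; my_extension.compute() stands in for the real entry point. Note that tracemalloc only sees allocations made through Python's allocators, so objects kept alive by a missing Py_DECREF will show up, but raw malloc() leaks inside the C code will not (that is valgrind territory):
import tracemalloc
import my_extension                        # placeholder for the real package

tracemalloc.start()
before = tracemalloc.take_snapshot()
result = my_extension.compute()            # placeholder call into the C extension
del result
after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)                            # net growth in size/count per allocation site
On a debug build of CPython, comparing sys.gettotalrefcount() before and after the call is another quick check for reference-count leaks.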

How to change the stack size of subprocess in Python

Here's my somewhat complicated setup, which just began causing a stack-overflow exception a couple of days ago:
On my Windows-based continuous integration platform I have a Jenkins job that starts a Python script.
This Python script runs a cmake command, an msbuild call, and then executes the newly compiled gtest-based test framework.
The msbuild produces a dll and the gtest executable. The executable itself then loads the dll in order to test it.
A couple of days ago I made some changes in the source code of the dll that alter the memory footprint of some of my structures (basically just array lengths). It's plain C code. Now some of the tests exit with a stack-overflow exception.
I admit I'm putting some data structures on the stack that don't necessarily have to be there, but it's the best I've got for information hiding in C (better than using static global variables):
if (myCondition)
{
    int hugeBuffer[20000];  /* roughly 80 kB of stack for this block alone */
    ...
}
Apart from that there is no recursion or anything fancy going on that could be a legit source of trouble. Large chunks of data are always passed by reference (pointer).
Anyway, the stack overflow exception doesn't occur on my local machine running the gtest executable directly from Visual Studio unless I significantly reduce the reserved stack memory in the linker settings.
Then in debug mode I clearly run into a point where the stack just overflows at the beginning of a function.
Unfortunately I couldn't find any way of debugging how full the stack is. In VS I've only got the call stack window which doesn't show the current "fill level" of the stack.
So, although you guys might kill me for this, I'm guessing I really just don't have enough stack memory available when running the Jenkins job.
So I'm wondering what step actually defines the amount of stack memory available for my DLL code. It's clearly less than the default 10MB I have in VisualStudio on my local machine.
In the msbuild step there is no STACK parameter used for the linker so I'm guessing the exe header should contain the same value as in Visual Studio (10MB?).
The Python script runs subprocess.call, which could ignore the value set by the linker and override it. I couldn't find any information on that, nor on how to change the amount of stack memory allocated. I don't even know whether it spawns a thread or a process, which may also affect the stack size.
The DLL loading mechanism in windows is also somewhat mysterious to me but I'm guessing the dll uses the same stack as the executable using it. I'm using the LoadLibrary() macro from WinBase.h.
By sheer luck I found out that although the same CMakeLists.txt is used for creating the projects (locally and on Jenkins), the resulting generated projects differ.
My local projects (and the Visual Studio solution I manually created through the GUI on the server for troubleshooting) had 10 MB of stack reserved, whereas the Python-driven call to CMake was only using the default value of 1 MB, which is not enough for the project.
This strange behavior of CMake can be compensated for by adding this line to CMakeLists.txt:
# Set linker to use 10MB of stack memory for all gtest executables, the default of 1MB is not enough!
SET( CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /STACK:\"10000000\"")
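To see which value actually ended up in the executable, you can read the stack reserve straight out of the PE header, either with dumpbin /headers or with a short script; the sketch below assumes the third-party pefile package and uses a placeholder path:
import pefile                              # pip install pefile

pe = pefile.PE(r"build\tests\gtest_runner.exe")   # placeholder path to the built test exe
print("SizeOfStackReserve:", pe.OPTIONAL_HEADER.SizeOfStackReserve)   # default /STACK reserve is 1048576 (1 MB)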
sorry to bother you :)
I'm glad it's over

Find memory leaks when extending python with C

I wrote some C code to create a python module. I wrote the code myself (did not use SWIG etc). If you're interested, the C code is at the bottom of this thread.
Q: Is there any way to (hopefully, easily) find whether my C code has memory leaks? Is there any way to use python's awesomeness for finding memory leaks in its extensions?
If you are using a Linux environment, you can easily find the memory leaks with a tool called valgrind.
To get valgrind, first install it with the command
sudo apt-get install valgrind
After the installation is complete, compile your C code with debug symbols (gcc -g) and run the resulting program under valgrind; it will show you the reason for each memory leak and also point at the line where the leak occurred.
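As a complement to valgrind, the "use Python's awesomeness" angle from the question can be approximated by calling the suspect function in a loop and checking that the object count and process memory stop growing; mymodule.leaky_candidate is a placeholder, and the resource module is Unix-only:
import gc
import resource
import mymodule                            # placeholder for your extension

for round_no in range(10):
    for _ in range(100_000):
        mymodule.leaky_candidate()         # placeholder for the function under suspicion
    gc.collect()
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss   # peak RSS, in kB on Linux
    print(f"round {round_no}: live objects={len(gc.get_objects())} peak rss={peak_kb} kB")
If both numbers level off after the first few rounds, there is probably no leak at this call site; a steady climb in peak RSS with a flat object count points at C-level allocations rather than forgotten Py_DECREFs.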

Python - Memory Leak

I'm working on solving a memory leak in my Python application.
Here's the thing - it really only appears to happen on Windows Server 2008 (not R2) but not earlier versions of Windows, and it also doesn't look like it's happening on Linux (although I haven't done nearly as much testing on Linux).
To troubleshoot it, I set up debugging on the garbage collector:
gc.set_debug(gc.DEBUG_UNCOLLECTABLE | gc.DEBUG_INSTANCES | gc.DEBUG_OBJECTS)
Then, periodically, I log the contents of gc.garbage.
Thing is, gc.garbage is always empty, yet my memory usage goes up and up and up.
Very puzzling.
If there's never any garbage in gc.garbage, then I'm not sure what you're trying to do by enabling GC debugging. Sure, it'll tell you which objects are considered for cleanup, but that's not particularly interesting if you end up with no circular references that can't be cleaned up.
If your program is using more and more memory according to the OS, there can generally be four different cases at play:
1. Your application is storing more and more things, keeping references to each one so they don't get collected.
2. Your application is creating circular references between objects that can't be cleaned up by the gc module (typically because one of them has a __del__ method).
3. Your application is freeing (and re-using) memory, but the OS doesn't want the memory re-used, so it keeps allocating new blocks of memory.
4. The leak is a real memory leak, but in a C/C++ extension module your code is using.
From your description it sounds like it's unlikely to be #1 (as it would behave the same on any OS) and apparently not #2 either (since there's nothing in gc.garbage). Considering #3, Windows (in general) has a memory allocator that's notoriously bad with fragmented allocations, but Python works around this with its obmalloc frontend for malloc(). It may still be something specific to the Windows Server 2008 system libraries that makes it look like your application is using more and more memory, though. Or it may be a case of #4: a C/C++ extension module, or a DLL used by Python or an extension module, with a memory leak.
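A hedged sketch for telling case #1 apart from the rest: periodically count live objects by type and log the top entries next to where you already log gc.garbage. Counts that climb steadily point at references your own code keeps; flat counts combined with rising process memory point at case #3 or #4:
import gc
from collections import Counter

def log_top_object_types(n=15):
    # Count every object the garbage collector knows about, grouped by type name.
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    for name, count in counts.most_common(n):
        print(f"{name:30s} {count}")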
In general, the first culprit for memory leaks in python is to be found in C extensions.
Do you use any of them?
Furthermore, you say the issue happens only on 2008; I would then check the extensions for any incompatibility, because with Vista and 2008 there were quite a lot of small changes that caused issues in that area.
As an alternative, try executing your application in Windows compatibility mode, choosing Windows XP. This could help solve the issue, especially if it's related to changes in the security model.
