I have a very complicated class for which I'm attempting to make Python wrappers in SWIG. When I create an instance of the item in Python, however, I'm unable to initialize certain data members without receiving the message:
>>> myVar = myModule.myDataType()
swig/python detected a memory leak of type 'MyDataType *', no destructor found.
Does anyone know what I need to do to address this? Is there a flag I could be using to generate destructors?
SWIG always generates destructor wrappers (unless the %nodefaultdtor directive is used). However, when it doesn't know anything about a type, it generates an opaque pointer wrapper instead, which will cause leaks (and the message above).
Please check that myDataType is a type known to SWIG. Re-run SWIG with debug messages turned on and check for any messages similar to:
Nothing is known about Foo base type - Bar. Ignored
Receiving a message like the one above means that SWIG doesn't know your type hierarchy to its full extent and is operating on limited information, which can cause it not to generate a dtor.
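For illustration, a common cause is an interface file that never shows SWIG the full class declaration. Here is a minimal sketch of a fix, assuming the class lives in MyDataType.h (the header and module names are hypothetical):
%module myModule
%{
#include "MyDataType.h"   // compiled into the generated wrapper code
%}
// Let SWIG parse the full declaration (including the destructor) instead
// of treating MyDataType as an opaque, forward-declared type:
%include "MyDataType.h"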
The error message is pretty clear to me, you need to define a destructor for this type.
In Python 2.7 (which we need to support), the initialization function for a C/C++ extension should be declared with the PyMODINIT_FUNC macro, which effectively makes the function void. However, I'm not sure how we should handle errors that occur during this function. We could throw a C++ exception within the function, but I'm not thrilled about that idea. Is there a better way?
Here's the background: In order to work around an architectural problem that we cannot address in a single release, we need to have the user call Python via a script that we provide rather than directly from the Python executable. By checking the process name, we can detect the situation where the user calls via the executable rather than the script. In this case, we would like to issue an error message and then terminate gracefully.
You can use one of the PyErr_Set* functions, such as PyErr_SetString.
Exceptions are always checked after calling init_module_name().
It's not explicitly stated in the documentation, but if you look at the examples, or if you read the source, you'll see that it is true.
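For example, a minimal sketch for Python 2.7 (the module name and the process-name check are hypothetical stand-ins):
#include <Python.h>

static PyMethodDef module_methods[] = {
    {NULL, NULL, 0, NULL}  /* sentinel */
};

/* Hypothetical stand-in for the real process-name check. */
static int launched_via_wrapper_script(void)
{
    return 1;
}

PyMODINIT_FUNC
initmymodule(void)
{
    if (!launched_via_wrapper_script()) {
        /* Set a pending exception and return early; the import machinery
           checks for a pending exception after the init function runs and
           raises it, so the import fails gracefully. */
        PyErr_SetString(PyExc_RuntimeError,
                        "mymodule must be started via the provided script");
        return;
    }
    if (Py_InitModule("mymodule", module_methods) == NULL)
        return;
}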
Given a typical error message thrown by the Python interpreter:
TypeError: <sqlalchemy.orm.dynamic.AppenderBaseQuery object at 0x3506490> is not JSON serializable
Can I use that memory address to find the offending object using the python shell?
No, you can't. The only purpose of that address is to identify the object for debugging purposes.
If you really, really want to, it's not impossible. Just hard, and a very bad idea.
In CPython, you can use ctypes to convert a number into a pointer to any type you want. And to load and call functions out of sys.executable (and/or the so/dll/framework where the actual code is) just like any other library. And to define structures that match the C API structures.
If you're really careful, you'll get a quick segfault instead of corrupting everything all to hell. If you're really, really careful, you can occasionally pull off some unsavory hacks without even segfaulting.
However, in this case it's unlikely to do you any good. Sure, at some point there was a sqlalchemy.orm.dynamic.AppenderBaseQuery object at 0x3506490… but as soon as that object went out of scope, it was probably released, so by now there may be anything at that location…
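(From Python itself, ctypes.cast(0x3506490, ctypes.py_object).value is the one-liner that performs this conversion.) At the C level, the whole trick reduces to a cast like the sketch below; this is undefined behaviour unless a live object really is still at that address:
#include <Python.h>

// UNSAFE illustration: reinterpret a raw address as a Python object.
// If the original object has been freed, this reads garbage and will
// likely crash the interpreter, or silently corrupt it.
static PyObject *object_at(Py_uintptr_t addr)
{
    PyObject *obj = reinterpret_cast<PyObject *>(addr);
    Py_INCREF(obj);  // claim a reference before anything else frees it
    return obj;
}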
Is there any officially supported way to get the parent message for a given ProtoBuf message in Python? The way the Python protobuf interface is designed, we are guaranteed that each message will have at most one parent. It would be nice to be able to navigate from a message to its parent without building an external index.
Clearly, this information is present, and I can use the following code to get a weak pointer to the parent of any given message:
>>> my_parent = my_message._listener._parent_message_weakref
However, this uses internal attributes -- I would much rather use officially supported methods if possible.
If there is no officially supported way to do this, then I'll need to decide whether to build an external child→parent index (which could hurt performance), or to use this "hackish" method (appropriately wrapped).
After looking into this further (reading the source code), it's clear that there's no officially supported way to do this in Python.
I'm a long-time Python developer and I really love the dynamic nature of the language, but I wonder if Python would benefit from optional static typing.
Would it be beneficial to be able to apply static typing to the API of a library, and what would the disadvantages of this be?
I quickly sketched up a decorator implementing runtime-static type checking on pastebin and it works like this:
# A TypeError will be thrown if the argument "string" is not a "str" and if
# the returned value is not an "int"
@typed(int, string=str)
def getStringLength(string):
    return len(string)
Would it be practical to use a decorator like this on the API functions of a library? From my point of view, type checking is not needed in the internal workings of a domain-specific module of a library, but at the connection points between the library and its client, a simple form of design by contract via type checking could be useful, especially as a kind of enforced documentation that clearly states to the client what the library expects and returns.
Take the example below, where addObjectToQueue() and isObjectProcessed() are exposed for use by the client and processTheQueueAndDoAdvancedStuff() is an internal library function. I think type checking could be useful on the outward-facing functions, but it would only bloat and restrict the dynamic nature and usefulness of Python if used on the internal functions.
# some_library_module.py
import random

@typed(int, name=str)
def addObjectToQueue(name):
    return random.randint(0, 1000000)  # Some object id

def processTheQueueAndDoAdvancedStuff(arg_of_library_specific_type):
    # Function body here
    pass

@typed(bool, object_id=int)
def isObjectProcessed(object_id):
    return True
What would the disadvantages of using this technique be?
What would the disadvantages of my naive implementation on pastebin be?
I don't want answers discussing the conversion of Python to a statically typed language, but thoughts about API design-specific pros/cons. (please move this to programmers.stackexchange.com if you consider it not a question)
Personally, I don't find this idea attractive for Python. This is all just my opinion, of course, but for context I'll tell you that Python and Haskell are probably my two favourite programming languages - I like languages at both extreme ends of the static vs dynamic typing spectrum.
I see the main benefits of static typing as follows:
Increased likelihood that your code is correct once the compiler has accepted it; if I know I've threaded my values through all the operations I invoked in such a way that the result type of one always matches the input type of another, and the final result type is the one I wanted, it increases the probability that I've selected the correct operations. This point is of deeply arguable value, since it only really matters if you're not testing very much, which would be bad. But it is true that, when programming in Haskell, when I sit back and say "there, done!" I am actually done a lot of the time, whereas that's almost never true of my Python code.
The compiler automatically points out most of the places that need changing when I make an incompatible change to a data structure or interface (most of the time). Again, tests are still needed to actually be sure you've caught all the implications, but most of the time the compiler's nagging is actually sufficient, in my experience, which deeply simplifies such refactoring; you can go straight from implementing the core of the refactoring to testing that the program still works okay, because the actual work of making all the flow-on changes is almost mechanical.
Efficient implementation. The compiler gets to use all the knowledge it has about types to do optimisation.
Your suggested system doesn't really provide any of these benefits.
Having written a program making use of your library, I still don't know if it contains any type-incorrect uses of your functions until I do extensive testing with full code coverage to see if any execution path contains a bad call.
When I refactor something, I need to go through many many rounds of "run full test suite, look for exception, find where it came from, fix the code" to get anything at all like a static-typing compiler's problem detection.
Python will still be behaving as if those variables could be anything at any time.
And to get even that much, you've sacrificed the flexibility of Python duck-typing; it's not enough that I provide a sufficiently "list-like" object, I have to actually provide a list.
To me, this sort of static typing is the worst of both worlds. The main dynamic typing argument is "you have to test your code anyway, so you may as well use those tests to catch type errors and free yourself from having to work around the type system when it doesn't help you". That may or may not be a good argument with respect to a really good static type system, but it absolutely is a compelling argument with respect to a weak partial static type system that only detects type errors at runtime. I don't think nicer error messages (which is all it really buys you most of the time; a type error not caught at the interface is almost certainly going to throw an exception deeper in the call stack) is worth the loss of flexibility.
I have a compatibility library that uses SWIG to access a C++ library. I would find it useful to be able to create a SWIG-wrapped Python object inside this layer (as opposed to accepting the C++ object as an argument or returning one). I.e. I want the PyObject* that points to the SWIG-wrapped C++ object.
I discovered that the SWIG_NewPointerObj function does exactly this. The SWIG-generated xx_wrap.cpp file uses this function, and it's also made available in the header emitted by swig -python -external-runtime swigpyrun.h.
HOWEVER, I cannot find any reference to what the last argument to this function is. It appears that it specifies the ownership of the object, but there is no documentation that says what each of the options mean (or even what they all are).
It appears that the following are acceptable values:
0
SWIG_POINTER_OWN
SWIG_POINTER_NOSHADOW
SWIG_POINTER_NEW = OWN + NOSHADOW
SWIG_POINTER_DISOWN (I'm not sure if SWIG_NewPointerObj accepts this)
SWIG_POINTER_IMPLICIT_CONV (I'm not sure if SWIG_NewPointerObj accepts this)
I want to create an object that is used only in my wrapping layer. I want to create it from my own pointer to the C++ object (so I can change the C++ object's value and have the change reflected in the Python object). I need this so it can be passed to a Python callback function. I want to keep this one instance throughout the life of the program so that I don't waste time creating/destroying identical objects for each callback. Which option is appropriate, and what do I Py_INCREF?
When you create new pointer objects with SWIG_NewPointerObj, you may pass the following flags:
SWIG_POINTER_OWN
SWIG_POINTER_NOSHADOW
If SWIG_POINTER_OWN is set, the destructor of the underlying C++ class will be called when the Python pointer is finalized. By default, the destructor will not be called. See the Memory Management section of the SWIG documentation.
For your use case, you don't need to set any flags at all.
From what I can see in the sources, if SWIG_POINTER_NOSHADOW is set, then a basic wrapped pointer is returned. You will not be able to access member variables in Python. All you'll have is an opaque pointer.
Reference: /usr/share/swig/2.0.7/python/pyrun.swg
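Putting that together for the use case in the question, here is a sketch (the class name MyThing and the helper names are hypothetical; it assumes the type was wrapped so that SWIG_TypeQuery can find its descriptor). Passing 0 as the flags means Python never deletes the C++ object, and the single new reference returned by SWIG_NewPointerObj is the one you keep for the life of the program; the callback call itself only borrows it, so nothing else needs a Py_INCREF:
#include <Python.h>
#include "swigpyrun.h"  // generated by: swig -python -external-runtime swigpyrun.h
#include "MyThing.h"    // hypothetical wrapped C++ class

static MyThing the_instance;          // the one C++ object, owned by this layer
static PyObject *py_instance = NULL;  // its single Python wrapper

static PyObject *get_py_instance()
{
    if (py_instance == NULL) {
        swig_type_info *ty = SWIG_TypeQuery("MyThing *");
        // Flags == 0: Python does not own the pointer, so finalizing the
        // wrapper never calls ~MyThing(). The new reference returned here
        // is stored for the program's lifetime, keeping the wrapper alive.
        py_instance = SWIG_NewPointerObj(&the_instance, ty, 0);
    }
    return py_instance;  // borrowed reference
}

static void invoke_callback(PyObject *callback)
{
    // The wrapper holds a pointer to the_instance (not a copy), so changes
    // made to the C++ object are visible to the Python callback.
    PyObject *result = PyObject_CallFunctionObjArgs(callback, get_py_instance(), NULL);
    if (result == NULL) {
        PyErr_Print();
        return;
    }
    Py_DECREF(result);
}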