I have a Python program that calls some Cython code, which in turn wraps some raw C++ code. I would like to see how much memory the underlying C++ code is allocating. I've tried the memory_profiler module for Python, but it doesn't seem to detect anything allocated by the C++ code. My evidence is that I have a Cython object that stores an instance of a C++ object, and this C++ object should definitely hold onto a bunch of memory. Yet when I create an instance of the Cython object in Python (which stores an instance of the C++ object), memory_profiler reports no extra memory, or at most a negligible amount.
Is there another way to find out how much memory the underlying C++ objects allocate on Python's behalf? Or is there something similar to memory_profiler, but for Cython?
If you can run your program on Linux, use https://github.com/vmware/chap (for example, start with "summarize used").
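As a rough complement to a heap analyzer, the sketch below compares what Python's own tracer sees with the growth of the process's resident set size (RSS). It assumes Linux with glibc, and it uses libc's malloc through ctypes as a stand-in for memory allocated by wrapped C++ code; the helper names rss_bytes and measure_native_allocation are made up for this illustration. tracemalloc only records allocations that go through Python's allocator, so a native allocation shows up in RSS but not in the traced total.

```python
import ctypes
import os
import tracemalloc

# Assumption: Linux with glibc; CDLL(None) exposes the C library's malloc/free.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def rss_bytes():
    """Current resident set size, read from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def measure_native_allocation(size):
    """Allocate `size` bytes outside Python; return (traced_delta, rss_delta)."""
    tracemalloc.start()
    traced_before, _ = tracemalloc.get_traced_memory()
    rss_before = rss_bytes()

    # Stand-in for memory that a wrapped C++ object would allocate.
    addr = libc.malloc(size)
    if not addr:
        raise MemoryError("malloc failed")
    ctypes.memset(addr, 1, size)  # touch every page so it becomes resident

    traced_delta = tracemalloc.get_traced_memory()[0] - traced_before
    rss_delta = rss_bytes() - rss_before

    libc.free(addr)
    tracemalloc.stop()
    return traced_delta, rss_delta
```

Calling measure_native_allocation(32 * 1024 * 1024) should report a traced delta close to zero and an RSS delta of roughly the requested size. In a real program you would take the RSS reading before and after constructing the Cython object instead of calling malloc directly.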
I'm wondering if there is any memory overhead of using C++ classes/structs wrapped by pybind11.
Let's consider a simple example:
struct Person {
std::string name;
int age;
};
// With some basic bindings
pybind11::class_<Person>(m, "Person")
.def_readwrite("name", &Person::name)
.def_readwrite("age", &Person::age);
In addition, there is a C++ function that returns millions of persons via a std::vector<Person>.
Technically, it is easy to add a pybind11 binding for the function, but is it a good idea to do so?
Wrapping the function returns a Python list of person instances.
In general in Python it is inefficient to have a large number of tiny objects, because of memory and GC overheads. The typical solution in Python is to opt for columnar memory layouts, but do these worries apply for classes/structs wrapped by pybind11 as well?
Specifically: If the function returns 1 million elements, will pybind11 internally create another 1 million wrapper instances or do the bindings operate directly on the C++ objects without any overhead?
Does the type of the members matter?
The pybind11 documentation says that, with the default STL conversions, containers are copied every time they cross the binding boundary. That means the structures and containers in Python and C++ are independent, so changes to data in the C++ container are not reflected in Python (there are no references). It also means the data is duplicated: 1 million elements in the C++ container and 1 million elements in Python.
See here - https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html
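To put a rough number on the "many tiny objects versus columnar layout" concern raised in the question, here is a pure-Python sketch. The Person class below is a hypothetical stand-in, not a pybind11 wrapper, so the figures only illustrate the general per-object cost in Python and say nothing precise about pybind11's own per-instance overhead.

```python
import sys
from array import array


class Person:
    """Hypothetical pure-Python stand-in for one wrapped record."""
    __slots__ = ("name", "age")

    def __init__(self, name, age):
        self.name = name
        self.age = age


def row_layout_bytes(people):
    """Bytes for the list plus one object per record (members excluded)."""
    return sys.getsizeof(people) + sum(sys.getsizeof(p) for p in people)


def columnar_layout_bytes(names, ages):
    """Bytes for one list of name references plus one packed array of ages."""
    return sys.getsizeof(names) + sys.getsizeof(ages)


n = 100_000
names = ["person%d" % i for i in range(n)]
people = [Person(names[i], i % 100) for i in range(n)]
ages = array("i", (i % 100 for i in range(n)))

# The name strings are shared by both layouts, so they are not counted.
print(row_layout_bytes(people), columnar_layout_bytes(names, ages))
```

The row layout pays for one object header per record on top of the list of references, while the columnar layout stores the ages as packed C integers.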
Below is the program that defines a function within another function.
1) When we run python program.py, does every line of Python source get converted directly into a set of machine instructions that are executed on the processor?
2) The above diagram has GlobalFrame, LocalFrame, and Objects. In the above program, where do frames, objects, and code reside at runtime? Is there a separate memory space given to this program within the Python interpreter's virtual address space?
"Does every line of python source directly gets converted to set of machine instructions that get executed on processor?"
No. Python code (not necessarily by line) typically gets converted to an intermediate code which is then interpreted by what some call a "virtual machine" (confusingly, as VM means something completely different in other contexts, but ah well). CPython, the most popular implementation (which everybody thinks of as "python":-), uses its own bytecode and interpreter thereof. Jython uses Java bytecode and a JVM to run it. And so on. PyPy, perhaps the most interesting implementation, can emit almost any sort of resulting code, including machine code -- but it's far from a line by line process!-)
"Where does Frames Objects and code reside in runtime"
On the "heap", as defined by the malloc, or equivalent, in the C programming language in the CPython implementation (or Java for Jython, etc, etc).
That is, whenever a new PyObject is made (in CPython's internals), a malloc or equivalent happens and that object is forevermore referred to via a pointer (a PyObject*, in C syntax). Functions, frames, code objects, and so forth, almost everything is an object in Python -- no special treatment, "everything is first-class"!-)
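Both points can be checked from within CPython itself. The sketch below uses a hypothetical outer/inner pair (the original program is not reproduced here) to show that the function, its code object, and its frame are ordinary objects, and that what CPython produces from the source is bytecode. The exact opcodes printed by dis differ between CPython versions.

```python
import dis
import inspect
import types


def outer(x):
    """Hypothetical example: a function defined inside another function."""
    def inner(y):
        return x + y
    # The frame for this call of outer() is itself an ordinary object.
    frame = inspect.currentframe()
    return inner, frame


inner, frame = outer(1)

# Functions, code objects, and frames are all regular Python objects.
assert isinstance(outer, types.FunctionType)
assert isinstance(outer.__code__, types.CodeType)
assert isinstance(frame, types.FrameType)

# CPython compiled the source to bytecode, not to machine instructions.
dis.dis(outer)  # prints the bytecode; opcode names vary by version
```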
I have a problem with a project I am working on and am not sure about the best way to resolve it.
Basically I am pushing a slow python algorithm into a c++ shared library that I am using to do a lot of the numerically intense stuff. One of the c++ functions is of the form:
const int* some_function(inputs){
//does some stuff
int *return_array = new int[10];
// fills return array with a few values
return return_array;
}
I.e., it returns an array. This array is interpreted within Python using numpy's ndpointer as follows:
lib.some_function.restype = ndpointer(dtype=c_int, shape=(10,))
I have a couple of questions that I have been fretting over for a while:
1) I have dynamically allocated memory here. Given that I am calling this function through the shared library and into python, do I cause a memory leak? My program is long running and I will likely call this function millions of times, so this is important.
2) Is there a better data structure I can be using? If this was a pure c++ function I would return a vector, but from googling around, this seems to be a non-ideal solution in python with ctypes. I also have other functions in the c++ library that call this function. Given that I have just written the function and am about to write the others, I know to delete[] the returned pointer after use in these functions. However, I am unsatisfied with the current situation, as if someone other than myself (or indeed myself in a few months) uses this function, there is a relatively high chance of future memory leaks.
Thanks!
Yes, you are leaking memory. It is not possible for the Python code to automatically free the pointed-to memory (since it has no idea how it was allocated). You need to provide a corresponding de-allocation function (to call delete[]) and tell Python how to call it (possibly using a wrapper framework as recommended by @RichardHidges).
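A minimal sketch of the "paired deallocation function" idea from the Python side, assuming a POSIX system. Since the asker's shared library is not available here, libc's malloc and free stand in for some_function and for the deallocation function the library would need to export; some_function_stand_in and call_and_release are hypothetical names. In the real library the release step must call the exported function that runs delete[], because memory from new[] must not be passed to free.

```python
import ctypes

# Assumption: a POSIX system where CDLL(None) exposes the C library.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

N = 10


def some_function_stand_in():
    """Stand-in for the C++ some_function: returns a heap-allocated int[10]."""
    addr = libc.malloc(N * ctypes.sizeof(ctypes.c_int))
    if not addr:
        raise MemoryError("malloc failed")
    values = ctypes.cast(addr, ctypes.POINTER(ctypes.c_int))
    for i in range(N):
        values[i] = i
    return addr


def call_and_release():
    """Copy the result into Python-owned memory, then free the native buffer."""
    addr = some_function_stand_in()
    try:
        values = ctypes.cast(addr, ctypes.POINTER(ctypes.c_int))
        return values[0:N]  # slicing a ctypes pointer copies into a Python list
    finally:
        # In the real library this would be the exported deallocation
        # function (one that runs delete[]), not libc's free.
        libc.free(addr)
```

Wrapping the call and the release in one Python function means callers never hold the raw pointer, which addresses the worry about someone forgetting to free it later.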
You probably want to consider using either SWIG or boost::python
There's an example of converting a std::vector to a python list using boost::python here:
std::vector to boost::python::list
here is the link for swig:
http://www.swig.org
I am having difficulty using the Boost.Python facilities to expose my C++ code to Python. I've written the Boost.Python wrappers successfully, and I can access my C++ objects from Python without any error. I can also call a method of a Python file (module) from C++ using the boost attr("")() function without any problem.
My problem is the execution time of the Python method. According to the timings I've printed, each reference to a wrapped object takes on the order of microseconds in Python code. However, calling the Python method takes on the order of milliseconds, and that time increases with the number of references I make in Python to my wrapped C++ objects (only referencing/assigning, with no further use). I've done some searching, and my guesses about this increasing time are:
Some reference policies (the default policies) cause this problem by doing unnecessary operations when returning from the Python code, so I'm probably doing something wrong in the wrappers.
The Boost.Python call method has some overhead, for which there might be some options I'm not aware of.
It is worth mentioning that the Python method is called in each execution cycle of my program, and each time I get nearly (though not exactly) the same time.
I hope my description is clear enough. Below is part of my sample code:
One of my Wrappers:
class_<Vertex<> >("Vertex")
.def(init<float, float>())
.def_readwrite("x", &Vertex<>::x)
.def_readwrite("y", &Vertex<>::y)
.def("abs", &Vertex<>::abs)
.def("angle", &Vertex<>::angle)
.def(self - self)
.def(self -= self)
;
Calling a Python module method (which is "run"):
pyFile = import(fileName.c_str());
scope scope1(pyFile);
object pyNameSpace = scope1.attr("__dict__");
return extract<int>(pyFile.attr("run")());
I'm working on a Python application which uses a number of open source third-party libraries. One of the libraries is based on ctypes, and I recently found more than 10 separate memory leaks in it. The causes of these leaks ranged from circular references on objects with explicit destructors (which Python can't garbage collect) to using c_char_p as a return type for functions returning non-const character arrays (resulting in the character arrays being converted automatically to Python strings and the original C-allocated arrays never being freed.)
I fixed the leaks I found and submitted a pull request to the author of the library. I've done some extremely informal testing by creating and deleting objects in a loop and watching Python's memory usage as I do so, and I think I've found all the leaks. However, as I'm planning to use this library in an application that I'd like to open source and hopefully have a few other people use, I'd like to be more sure than that. So my question is: is there a systematic way to find memory leaks in ctypes-based libraries?
During the process of fixing the leaks I've already found, I tried Heapy and objgraph but neither were particularly useful for this purpose. As far as I can tell, both of them will only show objects allocated on the Python heap, so they're of no use in finding leaks caused by improper handling of heap space allocated by C libraries. Is there a tool I can use in Python that can show me allocations on the C heap, and preferably also which Python objects, if any, refer to the allocated addresses?
You could try running the application under Valgrind. Valgrind is a useful tool for profiling memory use in compiled applications. This will at least detect the leaks and report their source.
You will certainly get false positives from Python calls. Check out this site for a nice description of how to use suppressions, which allow you to specifically ignore certain types of errors. See also Python's premade list of suppressions (here), and a description of why they are needed (here).
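One of the leak patterns described in the question, a c_char_p return type on a function that hands back a heap-allocated character array, can be avoided by keeping the raw address and freeing it explicitly. The sketch below assumes a POSIX system and uses libc's strdup as a stand-in for such a library function; owned_string is a hypothetical helper name.

```python
import ctypes

# Assumption: a POSIX system where CDLL(None) exposes the C library.
libc = ctypes.CDLL(None)
# strdup stands in for a library function returning a malloc'd char array.
# With restype c_char_p, ctypes would convert the result to bytes and drop
# the address, so nothing could free the buffer afterwards.
libc.strdup.restype = ctypes.c_void_p
libc.strdup.argtypes = [ctypes.c_char_p]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def owned_string(data):
    """Call the native function, copy the result, and free the C buffer."""
    addr = libc.strdup(data)
    if not addr:
        raise MemoryError("strdup failed")
    try:
        return ctypes.string_at(addr)  # copies up to the terminating NUL
    finally:
        libc.free(addr)
```

Calling a wrapper like this in a long loop while running under Valgrind is one way to confirm that a particular fix actually closed the leak.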