Can anyone explain how Python dictionary and list variables are stored in memory? I know that Python manages memory using a heap and a stack, but I couldn't find a simple explanation of how memory is allocated when a dictionary variable is created: is it created in a stack frame or in heap space?
Well, let's look at the source code to figure it out!
From: https://github.com/python/cpython/blob/master/Objects/dictobject.c
static PyObject *
new_dict(PyDictKeysObject *keys, PyObject **values)
{
    PyDictObject *mp;
    ...
    mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
    ...
    return (PyObject *)mp;
}
So a new dict object appears to be allocated using PyObject_GC_New()
From: https://github.com/python/cpython/blob/master/Doc/c-api/gcsupport.rst#id9
.. c:function:: TYPE* PyObject_GC_New(TYPE, PyTypeObject *type)
Analogous to :c:func:`PyObject_New` but for container objects with the
:const:`Py_TPFLAGS_HAVE_GC` flag set.
From: https://github.com/python/cpython/blob/master/Objects/object.c
PyObject *
_PyObject_New(PyTypeObject *tp)
{
    PyObject *op;
    op = (PyObject *) PyObject_MALLOC(_PyObject_SIZE(tp));
    if (op == NULL)
        return PyErr_NoMemory();
    return PyObject_INIT(op, tp);
}
From: https://github.com/python/cpython/blob/master/Objects/obmalloc.c
#define MALLOC_ALLOC {NULL, _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree}
#ifdef WITH_PYMALLOC
# define PYMALLOC_ALLOC {NULL, _PyObject_Malloc, _PyObject_Calloc, _PyObject_Realloc, _PyObject_Free}
#endif
#define PYRAW_ALLOC MALLOC_ALLOC
#ifdef WITH_PYMALLOC
# define PYOBJ_ALLOC PYMALLOC_ALLOC
#else
# define PYOBJ_ALLOC MALLOC_ALLOC
#endif
static PyMemAllocatorEx _PyObject = PYOBJ_ALLOC;
...
void *
PyObject_Malloc(size_t size)
{
    /* see PyMem_RawMalloc() */
    if (size > (size_t)PY_SSIZE_T_MAX)
        return NULL;
    return _PyObject.malloc(_PyObject.ctx, size);
}
I think it's safe to assume at this point that these will ultimately call malloc, calloc, realloc, and free.
At this point this is no longer really a Python question: malloc and friends get their memory from the heap (exactly how is up to the OS and the C library), so the dict object itself is allocated in heap space, not in a stack frame.
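To make that concrete, here is a minimal sketch of my own (not taken from the CPython sources above): the local pointer lives in the C function's stack frame, while the dict object it points to comes from the object allocator, i.e. the heap, which is why it can outlive the call that created it.

#include <Python.h>

/* Sketch only: `d` (the pointer) is on this function's stack frame;
 * the PyDictObject it points to is heap-allocated by CPython. */
static PyObject *
make_dict(void)
{
    PyObject *d = PyDict_New();   /* heap-allocated dict object */
    if (d == NULL)
        return NULL;              /* allocation failed, exception is set */
    return d;                     /* caller now owns the new reference */
}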
Related
I'm chasing a memory leak that seems to come from a long-running process which contains a C extension that I wrote. I've been poring over the code and the Extensions docs and I'm sure it's correct but I'd like to make sure regarding the reference handling of PyList and PyDict.
From the docs I gather that PyDict_SetItem() borrows references to both key and value, hence I have to DECREF them after inserting. PyList_SetItem() and PyTuple_SetItem() steal a reference to the inserted item so I don't have to DECREF. Correct?
Creating a dict:
PyObject *dict = PyDict_New();
if (dict) {
    for (i = 0; i < length; ++i) {
        PyObject *key, *value = NULL;   /* initialized so the check below is well-defined */
        key = parse_string(ctx);        /* returns a PyString */
        if (key) {
            value = parse_object(ctx);  /* returns some PyObject */
            if (value) {
                PyDict_SetItem(dict, key, value);
                Py_DECREF(value);       /* correct? */
            }
            Py_DECREF(key);             /* correct? */
        }
        if (!key || !value) {
            Py_DECREF(dict);
            dict = NULL;
            break;
        }
    }
}
return dict;
Creating a list:
PyObject *list = PyList_New(length);
if (list) {
    PyObject *item;
    for (i = 0; i < length; ++i) {
        item = parse_object(ctx);  /* returns some PyObject */
        if (item) {
            PyList_SetItem(list, i, item);
            /* No DECREF here */
        } else {
            Py_DECREF(list);
            list = NULL;
            break;
        }
    }
}
return list;
The parse_* functions don't need extra scrutiny: they only create objects on their last line, like this (for example):
return PyLong_FromLong(...);
If they encounter an error, they don't create any object but set an exception earlier in the function body:
return PyErr_Format(...);
EDIT
Here's some output from valgrind --leak-check=full. Clearly it is my code leaking memory, but why? Why is PyDict_New at the top of the (recursive) chain? Does that mean that the dict created here doesn't get DECREF'd when the whole thing is garbage collected?
Just to be clear here: When I build a nested data structure of Python types in C and then DECREF the topmost instance, Python will recursively DECREF all the contents of the structure, won't it?
==4357== at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==4357== by 0x4F20DBC: PyObject_Malloc (in /usr/lib64/libpython3.6m.so.1.0)
==4357== by 0x4FC0F98: _PyObject_GC_Malloc (in /usr/lib64/libpython3.6m.so.1.0)
==4357== by 0x4FC102C: _PyObject_GC_New (in /usr/lib64/libpython3.6m.so.1.0)
==4357== by 0x4F11EC0: PyDict_New (in /usr/lib64/libpython3.6m.so.1.0)
==4357== by 0xE5821BA: parse_dict (parser.c:350)
==4357== by 0xE581987: parse_object (parser.c:675)
==4357== by 0xE5821F0: parse_dict (parser.c:358)
==4357== by 0xE581987: parse_object (parser.c:675)
==4357== by 0xE5823CE: parse (parser.c:727)
It turned out I had forgotten to Py_DECREF(item) after PyList_Append(list, item) in a seemingly unrelated piece of code. PyList_SetItem() steals references, PyList_Append() doesn't.
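For future reference, here is a minimal, untested sketch of that difference, using PyLong_FromLong as a stand-in for anything that returns a new reference:

#include <Python.h>

/* Sketch only: the ownership rules that caused the leak. */
static PyObject *
build_list_both_ways(void)
{
    PyObject *list = PyList_New(1);          /* one pre-sized slot */
    if (list == NULL)
        return NULL;

    /* PyList_SetItem steals the reference: the list takes ownership,
     * so there is no Py_DECREF of item afterwards (it even drops the
     * reference for us if it fails). */
    PyObject *item = PyLong_FromLong(1);
    if (item == NULL || PyList_SetItem(list, 0, item) < 0) {
        Py_DECREF(list);
        return NULL;
    }

    /* PyList_Append does NOT steal: it INCREFs internally, so we must
     * drop our own reference or the item leaks. */
    item = PyLong_FromLong(2);
    if (item == NULL) {
        Py_DECREF(list);
        return NULL;
    }
    if (PyList_Append(list, item) < 0) {
        Py_DECREF(item);
        Py_DECREF(list);
        return NULL;
    }
    Py_DECREF(item);                         /* the DECREF that was missing */

    return list;                             /* the Python list [1, 2] */
}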
I'm venturing into C extensions for the first time, and I'm somewhat new to C as well. I've got a working C extension; however, if I repeatedly call the utility in Python, I eventually get a segmentation fault: 11.
#include <Python.h>

static PyObject *getasof(PyObject *self, PyObject *args) {
    PyObject *fmap;
    long dt;
    if (!PyArg_ParseTuple(args, "Ol", &fmap, &dt))
        return NULL;
    long length = PyList_Size(fmap);
    for (int i = 0; i < length; i++) {
        PyObject *event = PyList_GetItem(fmap, i);
        long dti = PyInt_AsLong(PyList_GetItem(event, 0));
        if (dti > dt) {
            PyObject *output = PyList_GetItem(event, 1);
            return output;
        }
    }
    Py_RETURN_NONE;
}
The function args are
a time series (list of lists), e.g. [[1, 'a'], [5, 'b']]
a time (long), e.g. 4
It's supposed to iterate over the list of lists until it finds a value greater than the given time, then return that value. As I mentioned, it correctly returns the answer, but if I call it enough times, it segfaults.
My gut feeling is that this has to do with reference counting, but I'm not familiar enough with the concept to know if this is the direct cause.
Any help would be appreciated.
"My gut feeling is that this has to do with reference counting..." Your instincts are correct.
PyList_GetItem returns a borrowed reference, which means your function doesn't "own" a reference to the item. So there is a problem here:
PyObject *output = PyList_GetItem(event, 1);
return output;
You don't own a reference to the item, but you return it to the caller, so the caller doesn't own a reference either. The caller will run into a problem if the item is garbage collected while the caller is still trying to use it. So you'll need to increase the reference count of the item before you return it:
PyObject *output = PyList_GetItem(event, 1);
Py_INCREF(output);
return output;
That assumes that PyList_GetItem(event, 1) doesn't fail! Except for PyArg_ParseTuple, you aren't checking the return values of the C API functions, which means you are assuming the input argument always has the exact structure that you expect. That's fine while you're testing code and figuring out how this works, but eventually you should be checking the return values of the C API functions for failure, and handling it appropriately.
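Putting those two points together, a corrected version might look roughly like this (an untested sketch, keeping the Python 2 PyInt API from the question):

static PyObject *getasof(PyObject *self, PyObject *args) {
    PyObject *fmap;
    long dt;
    Py_ssize_t i, length;

    if (!PyArg_ParseTuple(args, "Ol", &fmap, &dt))
        return NULL;

    length = PyList_Size(fmap);
    if (length < 0)                                       /* fmap was not a list */
        return NULL;

    for (i = 0; i < length; i++) {
        PyObject *event = PyList_GetItem(fmap, i);        /* borrowed */
        PyObject *first;
        long dti;
        if (event == NULL)
            return NULL;
        first = PyList_GetItem(event, 0);                 /* borrowed */
        if (first == NULL)
            return NULL;
        dti = PyInt_AsLong(first);
        if (dti == -1 && PyErr_Occurred())
            return NULL;
        if (dti > dt) {
            PyObject *output = PyList_GetItem(event, 1);  /* borrowed */
            if (output == NULL)
                return NULL;
            Py_INCREF(output);     /* give the caller its own reference */
            return output;
        }
    }
    Py_RETURN_NONE;
}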
I've written a Python C++ extension; however, I have a problem with one of its functions.
The function provided by this extension takes 2 arrays as inputs and produces one as an output.
I've only left the relevant part of the function's code:
float* forward(float* input, float* kernels, npy_intp* input_dims, npy_intp* kernels_dims){
    float* output = new float[output_size];
    //some irrelevant matrix operation code
    return output;
}
And the wrapper:
static PyObject *module_forward(PyObject *self, PyObject *args)
{
    PyObject *input_obj, *kernels_obj;
    if (!PyArg_ParseTuple(args, "OO", &input_obj, &kernels_obj))
        return NULL;
    PyObject *input_array = PyArray_FROM_OTF(input_obj, NPY_FLOAT, NPY_IN_ARRAY);
    PyObject *kernels_array = PyArray_FROM_OTF(kernels_obj, NPY_FLOAT, NPY_IN_ARRAY);
    if (input_array == NULL || kernels_array == NULL) {
        Py_XDECREF(input_array);
        Py_XDECREF(kernels_array);
        return NULL;
    }
    float *input = (float*)PyArray_DATA(input_array);
    float *kernels = (float*)PyArray_DATA(kernels_array);
    npy_intp *input_dims = PyArray_DIMS(input_array);
    npy_intp *kernels_dims = PyArray_DIMS(kernels_array);
    /////////THE ACTUAL FUNCTION
    float* output = forward(input, kernels, input_dims, kernels_dims);
    Py_DECREF(input_array);
    Py_DECREF(kernels_array);
    npy_intp output_dims[4] = {input_dims[0], input_dims[1]-kernels_dims[0]+1, input_dims[2]-kernels_dims[1]+1, kernels_dims[3]};
    PyObject* ret_output = PyArray_SimpleNewFromData(4, output_dims, NPY_FLOAT, output);
    delete output;//<-----THE PROBLEMATIC LINE////////////////////////////
    PyObject *ret = Py_BuildValue("O", ret_output);
    Py_DECREF(ret_output);
    return ret;
}
The delete operator that I highlighted is where the magic happens: without it this function leaks memory, and with it the function crashes with a memory access violation.
The funny thing is that I wrote another method that returns two arrays, so the function returns a float** pointing to two float* arrays:
float** gradients = backward(input, kernels, grads, input_dims, kernel_dims, PyArray_DIMS(grads_array));
Py_DECREF(input_array);
Py_DECREF(kernels_array);
Py_DECREF(grads_array);
PyObject* ret_g_input = PyArray_SimpleNewFromData(4, input_dims, NPY_FLOAT, gradients[0]);
PyObject* ret_g_kernels = PyArray_SimpleNewFromData(4, kernel_dims, NPY_FLOAT, gradients[1]);
delete gradients[0];
delete gradients[1];
delete gradients;
PyObject* ret_list = PyList_New(0);
PyList_Append(ret_list, ret_g_input);
PyList_Append(ret_list, ret_g_kernels);
PyObject *ret = Py_BuildValue("O", ret_list);
Py_DECREF(ret_g_input);
Py_DECREF(ret_g_kernels);
return ret;
Notice that the second example works flawlessly, with no crashes or memory leaks, while still calling delete on the arrays after they have been built into PyArray objects.
Could someone enlighten me about what's going on here?
From the PyArray_SimpleNewFromData docs:
Create an array wrapper around data pointed to by the given pointer.
If you create an array with PyArray_SimpleNewFromData, it's going to create a wrapper around the data you give it, rather than making a copy. That means the data it wraps has to outlive the array. delete-ing the data violates that.
You have several options:
You could create the array differently so you don't just make a wrapper around the original data.
You could carefully control access to the array and make sure its lifetime ends before you delete the data.
You could create a Python object that owns the data and will delete the data when the object's lifetime ends, and set the array's base to that object with PyArray_SetBaseObject, so the array keeps the owner object alive until the array itself dies.
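A rough sketch of that last option (untested, and assuming the buffer really was allocated with new float[...] as in the question, and that the NumPy C API is already set up): wrap the pointer in a PyCapsule whose destructor frees it, and make that capsule the array's base object.

/* Destructor for the capsule: runs when the capsule is destroyed, i.e. once
 * the array that holds it as its base object has died. */
static void free_forward_output(PyObject *capsule) {
    float *data = (float *)PyCapsule_GetPointer(capsule, NULL);
    delete[] data;                           /* matches the new float[...] */
}

/* ...inside module_forward, replacing the `delete output;` line... */
PyObject *ret_output = PyArray_SimpleNewFromData(4, output_dims, NPY_FLOAT, output);
if (ret_output == NULL) {
    delete[] output;
    return NULL;
}
PyObject *owner = PyCapsule_New(output, NULL, free_forward_output);
if (owner == NULL) {
    Py_DECREF(ret_output);                   /* destroys the wrapper, not the data */
    delete[] output;
    return NULL;
}
/* PyArray_SetBaseObject steals the reference to `owner` on success. */
if (PyArray_SetBaseObject((PyArrayObject *)ret_output, owner) < 0) {
    Py_DECREF(owner);                        /* runs the destructor, frees output */
    Py_DECREF(ret_output);
    return NULL;
}
/* Do NOT delete `output` here: the capsule owns it now and frees it when the
 * returned array is garbage collected. */
return ret_output;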
I am moderately experienced in Python and C but new to writing Python modules as wrappers around C functions. For a project I needed one function, named "score", to run much faster than I could get in Python, so I coded it in C and literally just want to be able to call it from Python. It takes a Python list of integers, and I want the C function to get an array of integers, the length of that array, and then return an integer back to Python. Here is my current (working) solution.
static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    PyObject *seq, *data;
    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;
    seq = PySequence_Fast(data, "expected a sequence");
    size = PySequence_Size(seq);
    gene = (int*) PyMem_Malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
        gene[i] = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));
    /* Call the external C function*/
    value = score(gene, size);
    PyMem_Free(gene);
    /* Build the output tuple */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}
This works, but it seems to leak memory at a rate I can't ignore. I made sure the leak is happening in the shown function by temporarily making the score function just return 0 and still saw the leaking behavior. I had thought the call to PyMem_Free would take care of the PyMem_Malloc'ed storage, but my current guess is that something in this function is being allocated and retained on each call, since the leaking behavior is proportional to the number of calls to this function. Am I not doing the sequence-to-array conversion correctly, or am I returning the final value inefficiently? Any help is appreciated.
seq is a new reference to a Python object (PySequence_Fast returns a new reference even when the input is already a list or tuple), so you need to Py_DECREF it when you're done with it; that is your leak. You should check whether seq is NULL, too.
Something like (untested):
static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    long temp;
    PyObject *seq, *data;
    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;
    if (!(seq = PySequence_Fast(data, "expected a sequence")))
        return NULL;
    size = PySequence_Size(seq);
    gene = (int*) PyMem_Malloc(size * sizeof(int));
    if (gene == NULL) {
        Py_DECREF(seq);
        return PyErr_NoMemory();
    }
    for (i = 0; i < size; i++) {
        temp = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));
        if (temp == -1 && PyErr_Occurred()) {
            PyMem_Free(gene);
            Py_DECREF(seq);
            PyErr_SetString(PyExc_ValueError, "an integer value is required");
            return NULL;
        }
        /* Do whatever you need to verify temp will fit in an int */
        gene[i] = (int)temp;
    }
    /* Call the external C function */
    value = score(gene, size);
    PyMem_Free(gene);
    Py_DECREF(seq);
    /* Build the output value */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}
I have a class with an int member and an int[2] member, and a getMember accessor method that takes the index of a member and a void*, and fills the (pre-allocated) space the void* points to with that member's value:
foobar.h:
class Foobar {
public:
    void getMember(int index, void* data) {
        switch (index) {
        case 0:
            *(int *) data = member0;
            break;
        case 1:
            *(int*) data = member1[0];
            *((int*) data + 1) = member1[1];
            break;
        }
    }
    int member0;
    int member1[2];
};
I can then write a SWIG interface to this:
%{
#include "foobar.h"
%}
%include "foobar.h"
Now, if I also add
%include <cpointer.i>
%pointer_functions(int, intp)
I can then do the following in Python:
>>> p = new_intp()
>>> f = Foobar()
>>> f.member0 = 2
>>> f.getMember(0, p)
>>> intp_value(p)
2
Question 1. The function is declared to take a void*, and I am passing an intp, yet the whole thing works. Why?
Question 2. Assuming you can explain how the above works: how do I accomplish the same for member1? That is, I added the pointer_functions code to make the above work (magically). What similar thing do I need to add, and what pointer p1 do I pass, so that
>>>f.getMember(1, p1)
works?
Well, I still can't answer Question 1; nevertheless, I found the new "magic" that answers Question 2:
%include <carrays.i>
%array_functions(int, inta);
Now Question 1 is unchanged, and Question 2 becomes: why does this work?
I'd have to check the code generated by SWIG, but my guess is that for a void* function parameter SWIG can't do any type checking, so any pointer type given to the function will be accepted and passed on. Then in your code you cast the void* to an int*, so as long as what was passed really is an int*, all is good. If you passed something that is not an int*, you would get undefined behavior, such as a crash, since you would be overwriting part of some other object or, worse, a smaller type such as a short.
So this should tell you that what you are doing is rather dangerous. I don't see why you can't declare your function to take an int*, since a pointer can refer to a single item or to an array:
void getMember(int index, int* data) {
    switch (index) {
    case 0:
        *data = member0;
        break;
    case 1:
        *data = member1[0];
        *(data + 1) = member1[1];
        break;
    }
}
Then SWIG will generate code that checks that the passed-in type is int* and will throw otherwise. I don't know whether SWIG knows that your inta type is compatible with the intp type. If not, you could %extend Foobar with an adapter in your .i file:
%extend Foobar {
    void getMember(int index, int data[2]) {
        $self->getMember(index, data);  // C++ knows this is ok
    }
}