I am wondering why numpy.zeros takes up so little space.
x = numpy.zeros(200000000)
This takes up essentially no memory, while
x = numpy.repeat(0,200000000)
takes up around 1.5GB. Does numpy.zeros create an array of empty pointers?
If so, is there a way to set a pointer back to empty in the array after changing it in Cython? If I use:
x = numpy.zeros(200000000)
x[0:200000000] = 0.0
The memory usage goes way up. Is there a way to change a value and then change it back to the format numpy.zeros originally had, in Python or Cython?
Are you using Linux? Linux has lazy allocation of memory. The underlying calls to malloc and calloc in numpy always 'succeed'. No memory is actually allocated until the memory is first accessed.
The zeros function will use calloc, which zeros any allocated memory before it is first accessed. Therefore, numpy need not explicitly zero the array, and the array will be lazily initialised. The repeat function, on the other hand, cannot rely on calloc to initialise the array. Instead it must use malloc and then copy the repeated value to every element in the array (thus forcing immediate allocation).
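A quick way to see the effect is to watch the process's resident memory around the allocation. This is a rough sketch, assuming Linux (where resource.getrusage reports ru_maxrss in kilobytes) and roughly 1.6 GB of free RAM:

import resource
import numpy

def rss_mb():
    # Peak resident set size of this process, in megabytes (Linux reports KB).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

print("at start:     %.0f MB" % rss_mb())
x = numpy.zeros(200000000)   # calloc: pages are reserved but never touched
print("after zeros:  %.0f MB" % rss_mb())
x[:] = 0.0                   # writing touches every page, forcing real allocation
print("after write:  %.0f MB" % rss_mb())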
The Python C API includes a function, PyBuffer_IsContiguous, for verifying whether a returned buffer is contiguous.
How do I construct an object such that this function returns a false value?
Here are a few examples that do not work (all tried with Python 3.10):
import numpy as np

a = b"123"
b = b"456"
c = bytearray(a[0:1] + b[0:1])
# This one returns a contiguous buffer even if id(b[0]) == id(c[1])
# Who does the copying? Why? Any rules?
np.zeros((2, 3)).T
# PyObject_GetBuffer fails instead of returning a strided array
memoryview(b"123")[::2]
# PyObject_GetBuffer fails instead of returning a strided array
c = bytearray(a[0:1] + b[0:1])
# This one returns a contiguous buffer even if id(b[0]) == id(c[1])
# Who does the copying? Why? Any rules?
There are actually several copies being made here. a[0:1] and b[0:1] return new (i.e. copied) bytes objects, not views on the existing bytes objects. ... + ... also returns a new bytes object containing the combined contents. bytearray then also creates a new bytearray object with its own internal memory store.
id(b[0]) == id(c[1]) tells you nothing - b[0] and c[1] return a reference to a Python int. That Python int does not share memory with the bytes object that generated it. What you're seeing is the result of an optimization - Python keeps a cache of small ints and returns a reference to an object from that cache rather than creating a new one. So you get multiple references to the same int.
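A small demonstration of that cache (CPython-specific behaviour; both bytes happen to be the value 52, i.e. '4'):

a = b"123"
b = b"456"
c = bytearray(a[0:1] + b[0:1])   # c == bytearray(b"14")
print(b[0], c[1])                # 52 52
print(b[0] is c[1])              # True: the same cached small-int object...
print(id(b[0]) == id(c[1]))      # ...so the id check says nothing about the buffers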
Your other two suggestions are non-contiguous arrays. As suggested in the comments: PyObject_GetBuffer has a flags argument that controls what type of buffer it can return (and it'll raise an error if it can't produce the right type of buffer). The documentation has three tables detailing the various flags, but largely anything with strides marked as "YES" should be able to handle a non-contiguous array. Your suggestion (in the comments) of PyBUF_STRIDES is probably the simplest flag that will work.
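At the Python level, memoryview() asks for a fully general (strided) buffer, so it is a handy way to poke at this. A small sketch; the .contiguous attributes report roughly what PyBuffer_IsContiguous would say:

import numpy as np

m1 = memoryview(np.zeros((2, 3)).T)      # transposed: Fortran- but not C-contiguous
print(m1.c_contiguous, m1.f_contiguous)  # False True

m2 = memoryview(b"123")[::2]             # strided view of a bytes object
print(m2.contiguous)                     # False

try:
    m2.cast('B')                         # cast() insists on a C-contiguous view
except TypeError as e:
    print("rejected:", e)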
I am working with very large numpy/scipy arrays that take up a huge chunk of memory. Suppose my code looks something like the following:
def do_something(a):
    a = a / a.sum()  # new memory is allocated
    # I don't need the original a any longer; how do I delete it?
    # do a lot more stuff

# a = super large numpy array
do_something(a)
print(a)  # still the same as originally
So I am calling a function with a huge numpy array. The function processes the array in some way or other, but the original object is still kept in memory. Is there any way to free that memory from inside the function? Deleting the local reference does not work.
What you want cannot be done; Python will only free the memory when all references to the array object are gone, and you cannot delete the a reference in the calling namespace from the function.
Instead, break up your problem into smaller steps. Do your calculations on a in one function, delete a, then call another function to do the rest of the work.
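For example, a minimal sketch of that restructuring (function names are illustrative):

import numpy

def normalise(a):
    return a / a.sum()          # allocates the new array

def do_more_stuff(b):
    print(b.sum())              # stand-in for the rest of the work

a = numpy.ones(10000000)        # stand-in for the huge array
b = normalise(a)
del a                           # last reference to the original is gone; it can be freed
do_more_stuff(b)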
Python uses a simple GC scheme based primarily on reference counting (there is a generational collector too, but it isn't relevant here): every new reference to an object increments a counter, and every reference that goes away decrements it.
The memory is deallocated only after the counter reaches 0.
So as long as you hold a reference to the object, it stays in memory.
In your case the caller of do_something still holds a reference to the object; if you want that reference to disappear, you can reduce the scope of that variable.
If you suspect memory leaks, you can set the DEBUG_LEAK flag and inspect the output; more info here: https://docs.python.org/2/library/gc.html
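You can watch the counter with sys.getrefcount (note that it reports one extra reference for its own argument):

import sys
import numpy

a = numpy.zeros(1000)
print(sys.getrefcount(a))   # 2: the name 'a' plus getrefcount's own argument
b = a                       # a second name for the same object
print(sys.getrefcount(a))   # 3
del b                       # the counter drops again; at 0 the array would be freed
print(sys.getrefcount(a))   # 2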
I'm kind of new (1 day) to Python so maybe my question is stupid. I've already looked here but I can't find my answer.
I need to modify the content of an array at a random offset with a random size.
I have a Python API interfacing a DLL for a USB device, which I can't modify. There is a function just like this one:
def read_device(in_array):
    # The DLL accesses the USB device and reads it into in_array of type array('B').
    # in_array can be an int (the function will create an array of that size and
    # return it), an array, or a tuple (array, int).
In MY code, I create an array of, let's say, 64 bytes, and I want to read 16 bytes starting from the 32nd byte. In C, I'd give &my_64_array[31] to the read_device function.
In Python, if I give:
read_device(my_64_array[31:31+16])
it seems that in_array receives a copy of the given subset, so my_64_array is not modified.
What can I do? Do I have to split my_64_array and recombine it afterwards?
Seeing as you are not able to update or change the API code, the best method is to pass the function a small temporary array and then assign it into your existing 64-byte array after the function call.
That would be something like the following, not knowing the exact specifics of your API call:
the_64_array[31:31+16] = read_device(16)
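Spelled out a little more (read_device here is a placeholder standing in for your DLL call, assumed to return an array('B') of the requested length when given an int):

from array import array

def read_device(n):
    # placeholder for the real DLL call, which fills and returns an array('B')
    return array('B', range(n))

my_64_array = array('B', [0] * 64)
chunk = read_device(16)            # small temporary array from the device
my_64_array[31:31+16] = chunk      # splice the result back into the big array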
It's precisely as you say: when you pass a slice into a function, it receives a copy of that slice, not a view onto the original array.
Two possible ways to put the data back afterwards (assuming read_device returns the relevant slice):
my_64_array = my_64_array[:31] + read_device(my_64_array[31:31+16]) + my_64_array[31+16:]
# equivalent, but around 33% faster even for small arrays (length 10), and about 3x faster at length 50...
my_64_array[31:31+16] = read_device(my_64_array[31:31+16])
So I think you should be using the latter.
If it were a modifiable function (which it's not in this case!), you could change the function's arguments so that one of them is the entire array:
def read_device(the_64_array, start=31, end=47):
    # some code
    the_64_array[start:end] = ...  # modifies the_64_array in place
and call read_device(my_64_array) or read_device(my_64_array, 31, 31+16).
When reading a subset of a list, you're calling that list's __getitem__ with a slice(x, y) argument. In your case these statements are equivalent:
my_64_array[31:31+16]
my_64_array.__getitem__(slice(31, 31+16))
This means that the __getitem__ function can be overridden in a subclass to obtain different behaviour.
You can also set the same subset using a[1:3] = [1,2,3] in which case it'd call a.__setitem__(slice(1, 3), [1,2,3])
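For instance, under Python 3 a minimal (illustrative) subclass shows the slice objects arriving:

class LoggingList(list):
    # Toy example: report the keys passed in; a real implementation would
    # translate the slices into device offsets instead of printing them.
    def __getitem__(self, key):
        print("__getitem__ got", key)
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        print("__setitem__ got", key)
        super().__setitem__(key, value)

data = LoggingList(range(10))
data[2:5]                 # __getitem__ got slice(2, 5, None)
data[1:3] = [7, 8, 9]     # __setitem__ got slice(1, 3, None)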
So I'd suggest either of these:
pass the list (my_64_array) and a slice object to read_device instead of passing the result of __getitem__, after which you could read the necessary data and set the corresponding offsets. No subclassing. This is probably the best solution in terms of readability and ease of development.
subclassing list, overriding __getitem__ and __setitem__ to return instances of that subclass with a parent reference, and then changing all modifying or reading methods of a list to reference the parent list instead. This might be a little tricky if you're new to Python, but basically you'd exploit the fact that a Python list's behaviour is largely defined by the methods on the list instance. This is probably better in terms of performance, as you can create references.
If read_device returns the resulting list, and that list is of equal size, you can do this: a[x:y] = read_device(a[x:y])
I have a C function that mallocs() and populates a 2D array of floats. It "returns" that address and the size of the array. The signature is
int get_array_c(float** addr, int* nrows, int* ncols);
I want to call it from Python, so I use ctypes.
import ctypes
mylib = ctypes.cdll.LoadLibrary('mylib.so')
get_array_c = mylib.get_array_c
I never figured out how to specify argument types with ctypes. I tend to just write a Python wrapper for each C function I'm using and make sure I get the types right in the wrapper. The array of floats is a matrix in column-major order, and I'd like to get it as a numpy.ndarray. But it's pretty big, so I want to use the memory allocated by the C function, not copy it. (I just found this PyBuffer_FromMemory stuff in this StackOverflow answer: https://stackoverflow.com/a/4355701/3691)
buffer_from_memory = ctypes.pythonapi.PyBuffer_FromMemory
buffer_from_memory.restype = ctypes.py_object
import numpy
def get_array_py():
    nrows = ctypes.c_int()
    ncols = ctypes.c_int()
    addr_ptr = ctypes.POINTER(ctypes.c_float)()
    get_array_c(ctypes.byref(addr_ptr), ctypes.byref(nrows), ctypes.byref(ncols))
    buf = buffer_from_memory(addr_ptr, 4 * nrows.value * ncols.value)
    return numpy.ndarray((nrows.value, ncols.value), dtype=numpy.float32,
                         order='F', buffer=buf)
This seems to give me an array with the right values. But I'm pretty sure it's a memory leak.
>>> a = get_array_py()
>>> a.flags.owndata
False
The array doesn't own the memory. Fair enough; by default, when the array is created from a buffer, it shouldn't. But in this case it should. When the numpy array is deleted, I'd really like python to free the buffer memory for me. It seems like if I could force owndata to True, that should do it, but owndata isn't settable.
Unsatisfactory solutions:
Make the caller of get_array_py() responsible for freeing the memory. That's super annoying; the caller should be able to treat this numpy array just like any other numpy array.
Copy the original array into a new numpy array (with its own, separate memory) in get_array_py, delete the first array, and free the memory inside get_array_py(). Return the copy instead of the original array. This is annoying because it's a memory copy that ought to be unnecessary.
Is there a way to do what I want? I can't modify the C function itself, although I could add another C function to the library if that's helpful.
I just stumbled upon this question, which is still an issue in August 2013. Numpy is really picky about the OWNDATA flag: There is no way it can be modified on the Python level, so ctypes will most likely not be able to do this. On the numpy C-API level - and now we are talking about a completely different way of making Python extension modules - one has to explicitly set the flag with:
PyArray_ENABLEFLAGS(arr, NPY_ARRAY_OWNDATA);
On numpy < 1.7, one had to be even more explicit:
((PyArrayObject*)arr)->flags |= NPY_OWNDATA;
If one has any control over the underlying C function/library, the best solution is to pass it an empty numpy array of the appropriate size from Python to store the result in. The basic principle is that memory allocation should always be done on the highest level possible, in this case on the level of the Python interpreter.
As kynan commented below, if you use Cython, you have to expose the function PyArray_ENABLEFLAGS manually; see the post Force NumPy ndarray to take ownership of its memory in Cython.
The relevant documentation is here and here.
I would tend to have two functions exported from my C library:
int get_array_c_nomalloc(float* addr, int nrows, int ncols); /* Pass addr as argument */
int get_array_c(float** addr, int* nrows, int* ncols); /* Calls the function above */
I would then write my Python wrapper[1] of get_array_c to allocate the array, then call get_array_c_nomalloc. Then Python does own the memory. You could integrate this wrapper into your library so your user never has to be aware of get_array_c_nomalloc's existence.
[1] This isn't really a wrapper anymore, but instead is an adapter.
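A rough ctypes sketch of what that adapter could look like, assuming the dimensions are known (or can be queried) up front, which the answer leaves open:

import ctypes
import numpy

mylib = ctypes.cdll.LoadLibrary('mylib.so')
get_array_c_nomalloc = mylib.get_array_c_nomalloc
get_array_c_nomalloc.argtypes = [ctypes.POINTER(ctypes.c_float),
                                 ctypes.c_int, ctypes.c_int]
get_array_c_nomalloc.restype = ctypes.c_int

def get_array_py(nrows, ncols):
    # Allocation happens here, in Python, so the ndarray owns its memory
    # and releases it normally when garbage collected.
    out = numpy.empty((nrows, ncols), dtype=numpy.float32, order='F')
    get_array_c_nomalloc(out.ctypes.data_as(ctypes.POINTER(ctypes.c_float)),
                         nrows, ncols)
    return out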
I have a numpy array that I would like to share between a bunch of python processes in a way that doesn't involve copies. I create a shared numpy array from an existing numpy array using the sharedmem package.
import sharedmem as shm
def convert_to_shared_array(A):
    shared_array = shm.shared_empty(A.shape, A.dtype, order="C")
    shared_array[...] = A
    return shared_array
My problem is that each subprocess needs to access rows that are randomly distributed in the array. Currently I create a shared numpy array using the sharedmem package and pass it to each subprocess. Each process also has a list, idx, of rows that it needs to access. The problem is in the subprocess when I do:
#idx = list of randomly distributed integers
local_array = shared_array[idx,:]
# Do stuff with local array
It creates a copy of the array instead of just another view. The array is quite large, and rearranging it before sharing it so that each process accesses a contiguous range of rows, like
local_array = shared_array[start:stop,:]
takes too long.
Question: What are good solutions for sharing random access to a numpy array between python processes that don't involve copying the array?
The subprocesses need readonly access (so no need for locking on access).
Fancy indexing induces a copy, so if you want to avoid copies you need to avoid fancy indexing; there is no way around it.
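A quick check of the difference (np.shares_memory needs a reasonably recent NumPy):

import numpy as np

a = np.arange(20).reshape(4, 5)
view = a[1:3, :]                  # basic slicing -> a view, no copy
fancy = a[[0, 2], :]              # fancy indexing -> a fresh copy

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, fancy)) # False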