How to "simulate" numpy.delete during processing

How to "simulate" numpy.delete during processing - python

For the sake of speeding up my algorithm that has numpy arrays with tens of thousands of elements, I'm wondering if I can reduce the time used by numpy.delete().
In fact, if I can just eliminate it?
I have an algorithm where I've got my array alpha.
And this is what I'm currently doing:
alpha = np.delete(alpha, 0)
beta = sum(alpha)
But why do I need to delete the first element? Is it possible to simply sum up the entire array using all elements except the first one? Will that reduce the time used in the deletion operation?

Avoid np.delete whenever possible. It returns a a new array, which means that new memory has to be allocated, and (almost) all the original data has to be copied into the new array. That's slow, so avoid it if possible.
beta = alpha[1:].sum()
should be much faster.
Note also that sum(alpha) is calling the Python builtin function sum. That's not the fastest way to sum items in a NumPy array.
alpha[1:].sum() calls the NumPy array method sum which is much faster.
Note that if you were calling alpha.delete in a loop, then the code may be deleting more than just the first element from the original alpha. In that case, as Sven Marnach points out, it would be more efficient to compute all the partial sums like this:
np.cumsum(alpha[:0:-1])[::-1]

Related

In python, whats the most efficient way to apply function to a list

my actual data is huge and quite heavy. But if I simplify and say if I have a list of numbers
x = [1,3,45,45,56,545,67]
and I have a function that performs some action on these numbers?
def sumnum(x):
return(x=np.sqrt(x)+1)
whats the best way apply this function to the list? I dont want to apply for loop. Would 'map' be the best option or anything faster/ efficient than that?
thanks,
Prasad

In standard Python, the map function is probably the easiest way to apply a function to an array (not sure about efficiency though). However, if your array is huge, as you mentioned, you may want to look into using numpy.vectorize, which is very similar to Python's built-in map function.
Edit: A possible code sample:
vsumnum = np.vectorize(sumnum)
x = vsumnum(x)
The first function call returns a function which is vectorized, meaning that numpy has prepared it to be mapped to your array, and the second function call actually applies the function to your array and returns the resulting array. Taken from the docs, this method is provided for convenience, not efficiency and is basically the same as a for loop
Edit 2:
As #Ch3steR mentioned, numpy also allows for elementwise operations to arrays, so in this case because you are doing simple operations, you can just do np.sqrt(x) + 1, which will add 1 to the square root of each element. Functions like map and numpy.vectorize are better for when you have more complicated operations to apply to an array

Best Way to Append C Arrays (or any arrays) in Cython

I have a function in an inner loop that takes two arrays and combines them. To get a feel for what it's doing look at this example using lists:
a = [[1,2,3]]
b = [[4,5,6],
[7,8,9]]
def combinearrays(a, b):
a = a + b
return a
def main():
print(combinearrays(a,b))
The output would be:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
The key thing here is that I always have the same number of columns, but I want to append rows together. Also, the values are always ints.
As an added wrinkle, I cheated and created a as a list within a list. But in reality, it might be a single dimensional array that I want to still combine into a 2D array.
I am currently doing this using Numpy in real life (i.e. not the toy problem above) and this works. But I really want to make this as fast as possible and it seems like c arrays should be faster. Obviously one problem with c arrays if I pass them as parameters, is I won't know the actual number of rows in the arrays passed. But I can always add additional parameters to pass that.
So it's not like I don't have a solution to this problem using Numpy, but I really want to know what the single fastest way to do this is in Cython. Since this is a call inside an inner loop, it's going to get called thousands of times. So every little savings is going to count big.
One obvious idea here would be to use malloc or something like that.

While I'm not convinced this is the only option, let me recommend the simple option of building a standard Python list using append and then using np.vstack or np.concatenate at the end to build a full Numpy array.
Numpy arrays store all the data essentially contiguously in memory (this isn't 100% true for if you're taking slices, but for freshly allocated memory it's basically true). When you resize the array it may get lucky and have unallocated memory after the array and then be able to reallocate in place. However, in general this won't happen and the entire contents of the array will need to be copied to the new location. (This will likely apply for any solution you devise yourself with malloc/realloc).
Python lists are good for two reasons:
They are internally a list of PyObject* (in this case to the Numpy arrays it contains). If copying is needed during the resize you are only copying the pointers to the arrays, and not the whole arrays.
They are designed to handle resizing/appending intelligently by over-allocating the space needed, so that they need only re-allocate more memory occasionally. Numpy arrays could have this feature, but it's less obviously a good thing for Numpy than it is for Python lists (if you have a 10GB data array that barely fits in memory do you really want it over-allocated?)
My proposed solution uses the flexibly, easily-resized list class to build your array, and then only finalizes to the inflexible but faster Numpy array at the end, therefore (largely) getting the best of both.
A completely untested outline of the same structure using C to allocate would look like:
from libc.stdlib cimport malloc, free, realloc
cdef int** ptr_array = NULL
cdef int* current_row = NULL
# just to be able to return a numpy array
cdef int[:,::1] out
rows_allocated = 0
try:
for row in range(num_rows):
ptr_array = realloc(ptr_array, sizeof(int*)*(row+1))
current_row = ptr_array[r] = malloc(sizeof(int)*row_length)
rows_allocated = row+1
# fill in data on current_row
# pass to numpy so we can access in Python. There are other
# way of transfering the data to Python...
out = np.empty((rows_allocated,row_length),dtype=int)
for row in range(rows_allocated):
for n in range(row_length):
out[row,n] = ptr_array[row][n]
return out.base
finally:
# clean up memory we have allocated
for row in range(rows_allocated):
free(ptr_array[row])
free(ptr_array)
This is unoptimized - a better version would over-allocate ptr_array to avoid resizing each time. Because of this I don't actually expect it to be quick, but it's meant as an indication of how to start.

Quick way to access first element in Numpy array with arbitrary number of dimensions?

I have a function that I want to have quickly access the first (aka zeroth) element of a given Numpy array, which itself might have any number of dimensions. What's the quickest way to do that?
I'm currently using the following:
a.reshape(-1)[0]
This reshapes the perhaps-multi-dimensionsal array into a 1D array and grabs the zeroth element, which is short, sweet and often fast. However, I think this would work poorly with some arrays, e.g., an array that is a transposed view of a large array, as I worry this would end up needing to create a copy rather than just another view of the original array, in order to get everything in the right order. (Is that right? Or am I worrying needlessly?) Regardless, it feels like this is doing more work than what I really need, so I imagine some of you may know a generally faster way of doing this?
Other options I've considered are creating an iterator over the whole array and drawing just one element from it, or creating a vector of zeroes containing one zero for each dimension and using that to fancy-index into the array. But neither of these seems all that great either.

a.flat[0]
This should be pretty fast and never require a copy. (Note that a.flat is an instance of numpy.flatiter, not an array, which is why this operation can be done without a copy.)

You can use a.item(0); see the documentation at numpy.ndarray.item.
A possible disadvantage of this approach is that the return value is a Python data type, not a numpy object. For example, if a has data type numpy.uint8, a.item(0) will be a Python integer. If that is a problem, a.flat[0] is better--see #user2357112's answer.

np.hsplit(x, 2)[0]
Source: https://numpy.org/doc/stable/reference/generated/numpy.dsplit.html
Source:
https://numpy.org/doc/stable/reference/generated/numpy.hsplit.html

## y -- numpy array of shape (1, Ty)
if you want to get the first element:
use y.shape[0]
if you want to get the second element:
use y.shape[1]
Source:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html
You can also use the take for more complicated extraction (to get few elements):
numpy.take(a, indices, axis=None, out=None, mode='raise')[source] Take
elements from an array along an axis.

Fill up a 2D array while iterating through it

An example what I want to do is instead of doing what is shown below:
Z_old = [[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0]]
for each_axes in range(len(Z_old)):
for each_point in range(len(Z_old[each_axes])):
Z_old[len(Z_old)-1-each_axes][each_point] = arbitrary_function(each_point, each_axes)
I want now to not initialize the Z_old array with zeroes but rather fill it up with values while iterating through it which is going to be something like the written below although it's syntax is horribly wrong but that's what I want to reach in the end.
Z = np.zeros((len(x_list), len(y_list))) for Z[len(x_list) -1 - counter_1][counter_2] is equal to power_at_each_point(counter_1, counter_2] for counter_1 in range(len(x_list)) and counter_2 in range(len(y_list))]

As I explained in my answer to your previous question, you really need to vectorize arbitrary_function.
You can do this by just calling np.vectorize on the function, something like this:
Z = np.vectorize(arbitrary_function)(np.arange(3), np.arange(5).reshape(5, 1))
But that will only give you a small speedup. In your case, since arbitrary_function is doing a huge amount of work (including opening and parsing an Excel spreadsheet), it's unlikely to make enough difference to even notice, much less to solve your performance problem.
The whole point of using NumPy for speedups is to find the slow part of the code that operates on one value at a time, and replace it with something that operates on the whole array (or at least a whole row or column) at once. You can't do that by looking at the very outside loop, you need to look at the very inside loop. In other words, at arbitrary_function.
In your case, what you probably want to do is read the Excel spreadsheet into a global array, structured in such a way that each step in your process can be written as an array-wide operation on that array. Whether that means multiplying by a slice of the array, indexing the array using your input values as indices, or something completely different, it has to be something NumPy can do for you in C, or NumPy isn't going to help you.
If you can't figure out how to do that, you may want to consider not using NumPy, and instead compiling your inner loop with Cython, or running your code under PyPy. You'll still almost certainly need to move the "open and parse a whole Excel spreadsheet" outside of the inner loop, but at least you won't have to figure out how to rethink your problem in terms of vectorized operations, so it may be easier for you.

rows = 10
cols = 10
Z = numpy.array([ arbitrary_function(each_point, each_axes) for each_axes in range(cols) for each_point in range(rows) ]).reshape((rows,cols))
maybe?

How can one efficiently remove a range of rows from a large numpy array?

Given a large 2d numpy array, I would like to remove a range of rows, say rows 10000:10010 efficiently. I have to do this multiple times with different ranges, so I would like to also make it parallelizable.
Using something like numpy.delete() is not efficient, since it needs to copy the array, taking too much time and memory. Ideally I would want to do something like create a view, but I am not sure how I could do this in this case. A masked array is also not an option since the downstream operations are not supported on masked arrays.
Any ideas?

Because of the strided data structure that defines a numpy array, what you want will not be possible without using a masked array. Your best option might be to use a masked array (or perhaps your own boolean array) to mask the deleted the rows, and then do a single real delete operation of all the rows to be deleted before passing it downstream.

There isn't really a good way to speed up the delete operation, as you've already alluded to, this kind of deleting requires the data to be copied in memory. The one thing you can do, as suggested by #WarrenWeckesser, is combine multiple delete operations and apply them all at once. Here's an example:
ranges = [(10, 20), (25, 30), (50, 100)]
mask = np.ones(len(array), dtype=bool)
# Update the mask with all the rows you want to delete
for start, end in ranges:
mask[start:stop] = False
# Apply all the changes at once
new_array = array[mask]
It doesn't really make sense to parallelize this because you're just copying stuff in memory so this will be memory bound anyways, adding more cpus will not help.

I don't know how fast this is, relative to the above, but say you have a list L of row indices of the rows you wish to keep from array A (by "rows" I mean the first index, for higher dimensional arrays). All other rows will be deleted. We'll let A hold the result.
A = A[np.ix_(L)]

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.