import numpy as np

Y, X = np.mgrid[-3:3:10j, -3:3:10j]
I've noticed that when applying certain operations to meshgrids like the one above, I get an error because the operations are not numpy-aware. Sometimes there is a numpy alternative for functions like sin or cos, but not for everything; for example, there is none for quad from scipy.integrate.
How do I get around this problem? I need to apply such operations to entire meshgrids.
Your question (with the follow-on comment) can be taken at least two different ways:
You have a function of multiple arguments, and you would like to be able to call that function in a manner that is syntactically similar to the broadcasted calls supported natively by numpy. Performance is not the issue, just the calling syntax of the function.
You have a function of multiple arguments that is to be evaluated on a sequence of numpy arrays, but the function is not implemented in such a manner that it can exploit the contiguous memory layout of numpy arrays. Performance is the issue; you would be happy to loop over the numpy arrays and call your function in a boring, plain old for-loop style, except that doing so is too slow.
For item 1, numpy provides a convenience function called vectorize, which takes a regular callable and returns a callable that can be called with numpy arrays as arguments and will obey numpy's broadcasting rules.
Consider this contrived example:
def my_func(x, y):
    return x + 2*y
Now suppose I need to evaluate this function everywhere in a 2-D grid. Here is the plain old boring way:
Y, X = np.mgrid[0:10:1, 0:10:1]
Z = np.zeros_like(Y)
for i in range(Y.shape[0]):
    for j in range(Y.shape[1]):
        Z[i, j] = my_func(X[i, j], Y[i, j])
If we had a few different functions like my_func, it might be nice to generalize this process into a function that "mapped" a given function over the 2-D arrays.
import itertools

def array_map(some_func, *arg_arrays):
    output = np.zeros_like(arg_arrays[0])
    coordinates = map(range, output.shape)   # one range per axis (itertools.imap in Python 2)
    for coord in itertools.product(*coordinates):
        args = [arg_array[coord] for arg_array in arg_arrays]
        output[coord] = some_func(*args)
    return output
Now we can see that array_map(my_func, X, Y) acts just like the nested for-loop:
In [451]: array_map(my_func, X, Y)
Out[451]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[ 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[16, 17, 18, 19, 20, 21, 22, 23, 24, 25],
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27]])
Now, wouldn't it be nice if we could call array_map(my_func), leave off the extra array arguments, and get back a new function that is just waiting to do the required for-loops?
We can do this with functools.partial -- so we can write a handy little vectorizer like this:
import functools

def vectorizer(regular_function):
    awesome_function = functools.partial(array_map, regular_function)
    return awesome_function
and testing it out:
In [453]: my_awesome_func = vectorizer(my_func)
In [454]: my_awesome_func(X, Y)
Out[454]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[ 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[16, 17, 18, 19, 20, 21, 22, 23, 24, 25],
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27]])
Now my_awesome_func behaves as if you are able to call it directly on top of ndarrays!
I've overlooked many extra little details (performance, bounds checking, etc.) while making this toy version called vectorizer ... but luckily numpy already provides vectorize, which does just this!
In [455]: my_vectorize_func = np.vectorize(my_func)
In [456]: my_vectorize_func(X, Y)
Out[456]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[ 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[ 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[16, 17, 18, 19, 20, 21, 22, 23, 24, 25],
[18, 19, 20, 21, 22, 23, 24, 25, 26, 27]])
Once again, as stressed in my earlier comments to the OP and in the documentation for vectorize, this is not a speed optimization. In fact, the extra function-calling overhead can make it slower than a directly written for-loop. But for cases where speed is not a problem, this method lets your custom functions adhere to the same calling conventions as numpy, which can improve the uniformity of your library's interface and make the code more consistent and readable.
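To connect this back to the original question: quad from scipy.integrate returns a scalar (plus an error estimate) and knows nothing about arrays, but it can be wrapped the same way. A minimal sketch, where the integrand and limits are my own assumptions purely for illustration:

import numpy as np
from scipy.integrate import quad

Y, X = np.mgrid[-3:3:10j, -3:3:10j]

def integrate_sin(a, b):
    # quad returns (value, estimated_error); keep only the value
    value, _ = quad(np.sin, a, b)
    return value

vec_integrate_sin = np.vectorize(integrate_sin)
Z = vec_integrate_sin(X, Y)   # Z[i, j] = integral of sin from X[i, j] to Y[i, j]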
A whole lot of other stuff has already been written about item 2. If your problem is that you need to optimize your functions to leverage contiguous blocks of memory and bypass repeated dynamic type checking (the main features numpy arrays add compared to Python lists), then here are a few links you may find helpful:
http://pandas.pydata.org/pandas-docs/stable/enhancingperf.html
http://csl.name/C-functions-from-Python/
https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow
nbviewer.ipython.org/url/jakevdp.github.io/downloads/notebooks/NumbaCython.ipynb
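To give a taste of the item-2 approach, here is a minimal sketch using Numba's JIT on the toy my_func from above (this assumes Numba is installed and is my own addition, not part of the links):

import numpy as np
from numba import njit

@njit
def my_func_grid(X, Y):
    # plain nested loops are fine here: Numba compiles them to machine code,
    # so there is no per-element Python call overhead
    Z = np.empty_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Z[i, j] = X[i, j] + 2 * Y[i, j]
    return Z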
I'm interested in reordering the bits within a number, and since I want to do it several trillion times, I want to do it fast.
Here are the details: given a number num and an order matrix order.
order contains up to ~6000 lines of permutations of the numbers 0..31.
These are the positions to which the bits move.
Simplified example: binary(num) = 1001, order[1]=[0,1,3,2], reordered number for order[1] would be 1010 (binary).
Now I want to know if my input number num is the smallest of these (~6000) reordered numbers. I'm searching for all 32-bit numbers which fulfill this criterion.
My current approach is too slow, so I'm looking for a speedup.
Minimal reproducible example:
num = 1753251840
order = [[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[ 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, 19, 18, 17, 16, 23, 22, 21, 20, 27, 26, 25, 24, 31, 30, 29, 28],
[15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16],
[31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23, 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31],
[21, 20, 23, 22, 29, 28, 31, 30, 17, 16, 19, 18, 25, 24, 27, 26, 5, 4, 7, 6, 13, 12, 15, 14, 1, 0, 3, 2, 9, 8, 11, 10]]
patterns = set()
bits = format(num, '032b')
for perm in order:
    bitsn = [bits[perm[i]] for i in range(32)]
    patterns.add(int(''.join(bitsn), 2))
print(min(patterns) == num)
Where can I start to improve this?
Extracting bits via strings is generally very inefficient (whatever the language), and the same applies to parsing them back into integers. Moreover, for such a fast low-level operation you need a JIT or a compiled language, as the comments already pointed out.
Here is a prototype using Numba's JIT (assuming all numbers are unsigned):
import numpy as np
from numba import njit

npOrder = np.array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31],
[ 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12, 19, 18, 17, 16, 23, 22, 21, 20, 27, 26, 25, 24, 31, 30, 29, 28],
[15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16],
[31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23, 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31],
[21, 20, 23, 22, 29, 28, 31, 30, 17, 16, 19, 18, 25, 24, 27, 26, 5, 4, 7, 6, 13, 12, 15, 14, 1, 0, 3, 2, 9, 8, 11, 10]], dtype=np.uint32)
@njit
def extractBits(num):
    bits = np.empty(32, dtype=np.int32)
    for i in range(32):
        bits[i] = (num >> i) & 0x01
    return bits

@njit
def permuteAndMerge(bits, perm):
    bitsnFinal = 0
    for i in range(32):
        bitsnFinal |= bits[31 - perm[i]] << i
    return bitsnFinal

@njit
def computeOptimized(num):
    bits = extractBits(num)
    permCount = npOrder.shape[0]
    patterns = np.empty(permCount, dtype=np.uint32)
    for i in range(permCount):
        patterns[i] = permuteAndMerge(bits, npOrder[i])
    # The array can be converted to a set here if needed, with set(patterns)
    return patterns.min() == num
This code is about 25 times faster than the original one on my machine (run 5,000,000 times).
You can also use Numba to accelerate and parallelize the loop that runs computeOptimized, resulting in a significant additional speed-up.
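A minimal sketch of that parallel driver, assuming the functions above are already defined (the search range and the function name countMinimal are my own illustration):

from numba import prange

@njit(parallel=True)
def countMinimal(start, stop):
    # count how many numbers in [start, stop) pass computeOptimized;
    # prange spreads the iterations over all CPU cores
    count = 0
    for n in prange(start, stop):
        if computeOptimized(np.uint32(n)):
            count += 1
    return count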
Note that this code can be made much faster still in C or C++ using low-level processor instructions (available, for example, on many x86_64 processors). With that and parallelism, the execution speed should be on the order of a billion permutations per second.
A couple of possible speed-ups, staying with Python and the current algorithm:
Bail out as soon as you find a pattern less than num; once one is found, the condition cannot possibly be true. (You also don't need to store patterns; at most a flag recording whether an equal one was found, if that's not guaranteed by the problem.)
bitsn could be a generator expression and doesn't need to be stored in a variable; you'll have to measure whether that's faster. A sketch combining both ideas follows this list.
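A minimal sketch of those two suggestions combined (pure Python, keeping the original string-based extraction, and assuming order contains the identity permutation so num itself is always among the reordered values):

def is_minimal(num, order):
    bits = format(num, '032b')
    for perm in order:
        reordered = int(''.join(bits[perm[i]] for i in range(32)), 2)
        if reordered < num:
            return False  # num cannot be the minimum; stop immediately
    return True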
More fundamental improvements:
If you want to find all the numbers (rather than just test a particular one), it feels like there ought to be a faster algorithm by considering what the bits mean. A couple of hours thinking could potentially let you process just the 6000 lists, rather than all 2³² integers.
As others have written, if you're after pure speed, Python is not the ideal language. It depends on the balance between how much time you want to spend on programming and how much on running the program.
Side note:
Are the 32-bit integers signed or unsigned?
If I have a list:
lst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
I would like to cast the above list into an array with the following arrangement of the elements:
array([[ 1,  2,  3,  7,  8,  9],
       [ 4,  5,  6, 10, 11, 12],
       [13, 14, 15, 19, 20, 21],
       [16, 17, 18, 22, 23, 24]])
How do I do this or what is the best way to do this? Many thanks.
I have done this in a crude way below, where I just extract each sub-matrix and then concatenate them all at the end:
np.array(results[arr.shape[0]*arr.shape[1]*0:arr.shape[0]*arr.shape[1]*1]).reshape(arr.shape[0], arr.shape[1])
array([[1, 2, 3],
[4, 5, 6]])
np.array(results[arr.shape[0]*arr.shape[1]*1:arr.shape[0]*arr.shape[1]*2]).reshape(arr.shape[0], arr.shape[1])
array([[ 7, 8, 9],
[ 10, 11, 12]])
etc.
But I will need a more generalized way of doing this (if there is one), as I will need to do this for arrays of any size.
You could use numpy's reshape function, with a bit of indexing:
a = np.arange(24)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
Using reshape and a bit of indexing:
a = a.reshape((8,3))
idx = np.arange(2)
idx = np.concatenate((idx,idx+4))
idx = np.ravel([idx,idx+2],'F')
b = a[idx,:].reshape((4,6))
Output:
>>> b
array([[ 0, 1, 2, 6, 7, 8],
[ 3, 4, 5, 9, 10, 11],
[12, 13, 14, 18, 19, 20],
[15, 16, 17, 21, 22, 23]])
Here the tuple (4, 6) passed to reshape indicates that you want your array to be 2-dimensional, with 4 rows of 6 elements; those values can be computed. Then we compute the index that puts the data in the correct order, which is obviously the complicated bit here. As I'm not sure what you mean by "any size of data", it's difficult for me to give you an agnostic way to compute that index.
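For the 8x3 case above, the index builds up like this (my own annotation of the answer's three lines):

idx = np.arange(2)                    # [0, 1]: the first two rows
idx = np.concatenate((idx, idx + 4))  # [0, 1, 4, 5]: plus the two rows four below
idx = np.ravel([idx, idx + 2], 'F')   # [0, 2, 1, 3, 4, 6, 5, 7]: interleave each
                                      # row with the row two below it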
Obviously, if you are using a list and not an np.array, you might have to convert the list first, for example by using np.array(your_list).
Edit:
I'm not sure if this is exactly what you are after, but it should work for any size evenly divisible by 12:
def custom_order(size):
    a = np.arange(size)
    a = a.reshape((size // 3, 3))
    idx = np.arange(2)
    idx = np.concatenate([idx + 4 * i for i in range(size // 12)])
    idx = np.ravel([idx, idx + 2], 'F')
    b = a[idx, :].reshape((size // 6, 6))
    return b
>>> custom_order(48)
array([[ 0, 1, 2, 6, 7, 8],
[ 3, 4, 5, 9, 10, 11],
[12, 13, 14, 18, 19, 20],
[15, 16, 17, 21, 22, 23],
[24, 25, 26, 30, 31, 32],
[27, 28, 29, 33, 34, 35],
[36, 37, 38, 42, 43, 44],
[39, 40, 41, 45, 46, 47]])
I am working in numpy and have a numpy array of the form:
[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15],
[16, 17, 18],
[19, 20, 21],
[22, 23, 24]]
I want to use only the reshape and transpose functions and obtain the following array:
[[ 1, 2, 3, 7, 8, 9, 13, 14, 15, 19, 20, 21],
[ 4, 5, 6, 10, 11, 12, 16, 17, 18, 22, 23, 24]]
Can this be done? I have spent hours trying and am starting to think it just can't be done - am I missing something obvious?
You can reshape into pairs of rows, transpose to group them, then reshape again, with something like:
a = np.array([[ 1,  2,  3],
              [ 4,  5,  6],
              [ 7,  8,  9],
              [10, 11, 12],
              [13, 14, 15],
              [16, 17, 18],
              [19, 20, 21],
              [22, 23, 24]])
a.reshape(-1, 2, 3).transpose((1, 0, 2)).reshape(2, -1)
# array([[ 1, 2, 3, 7, 8, 9, 13, 14, 15, 19, 20, 21],
# [ 4, 5, 6, 10, 11, 12, 16, 17, 18, 22, 23, 24]])
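Spelled out step by step, with the intermediate shapes (my own annotation):

step1 = a.reshape(-1, 2, 3)         # shape (4, 2, 3): split into pairs of rows
step2 = step1.transpose((1, 0, 2))  # shape (2, 4, 3): all first rows of the pairs, then all second rows
step3 = step2.reshape(2, -1)        # shape (2, 12): flatten each group into a single row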
You may also slice the even and odd rows and pass them to np.reshape:
a_out = np.reshape([a[::2], a[1::2]], (2,-1))
Out[81]:
array([[ 1, 2, 3, 7, 8, 9, 13, 14, 15, 19, 20, 21],
[ 4, 5, 6, 10, 11, 12, 16, 17, 18, 22, 23, 24]])
Hi, I have an array whose elements I want to sum vertically. Just wondering, are there any functions that can do this easily?
a = [[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]]
I want to print the answers of 1+6+11+16+21, 2+7+12+17, 3+8+13, 4+9, 5.
As you can see, each successive sum has one element fewer.
This is one approach, using zip to transpose the rows into columns and a simple iteration that trims the last i elements from the i-th column.
Ex:
a = [[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]]
print([sum(v[:-i]) if i else sum(v) for i, v in enumerate(zip(*a))])
Output:
[55, 38, 24, 13, 5]
Converting to a numpy array, and then using the following list comprehension
a = np.array(a)
[a[:5-i,i].sum() for i in range(5)]
yields the following:
[55, 38, 24, 13, 5]
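If you want to avoid the Python-level loop entirely, here is a fully vectorized sketch of my own (not from the answers above): reversing the columns turns the wanted entries into an upper triangle, which np.triu can mask out.

import numpy as np

a = np.array(a)
result = np.triu(a[:, ::-1]).sum(axis=0)[::-1]
# array([55, 38, 24, 13, 5])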
I have two arrays and I want to check how many integers the two arrays have in common. The problem I'm having is that my code only counts how many are the same when they are in the same position. Both arrays have 15 numbers in them.
Example:
import numpy as np
a = np.array([1, 4, 5, 7, 9, 14, 15, 17, 18, 19, 21, 22, 23, 25, 26])
b = np.array([8, 28, 12, 3, 24, 16, 23, 19, 14, 2, 11, 29, 27, 6, 13])
print(np.count_nonzero(a==b))
This prints 0 even though there are clearly integers that appear in both arrays. How can I make this print how many integers have the same value?
You want to use np.intersect1d, if I am understanding you correctly:
In [12]: import numpy as np
In [13]: a = np.array([1, 4, 5, 7, 9, 14, 15, 17, 18, 19, 21, 22, 23, 25, 26])
...: b = np.array([8, 28, 12, 3, 24, 16, 23, 19, 14, 2, 11, 29, 27, 6, 13])
...:
In [14]: np.intersect1d(a, b)
Out[14]: array([14, 19, 23])
You can perform broadcasted comparison between b and a, and then just tally up the matches:
(b == a[:, None]).sum()
3
This checks out since you have [14, 19, 23] as the common elements.
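To see why this works, look at the intermediate shapes (my own annotation):

comparison = b == a[:, None]  # a[:, None] has shape (15, 1), b has shape (15,)
comparison.shape              # (15, 15): every element of a compared with every element of b
comparison.sum()              # 3 True cells, corresponding to the matches 14, 19 and 23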