How to determine if numpy.vectorize() is needed?

Is there a way to determine at runtime if a function requires numpy.vectorize() to behave as expected?
For background, I ask this because I'm using Numpy in a program to calculate phase diagrams from thermodynamic functions available in the literature (CALPHAD-based). For a given temperature, one evaluates the free energy functions and determines the common tangent curves touching the concave-up regions (second derivative > 0) to define composition ranges of phase coexistence. For this, it was nice to directly define the second derivative function. All was going well with real free energy functions (not hard to get derivatives of) until I tried to test with a simple parabolic free energy, which has a constant second derivative. This blew up my algorithm, since I had not expected numpy broadcasting to look inside the function and decide it did not need to broadcast.
The difficulty comes down to this behavior:
import numpy as np

def f(x):
    return x * x

def g(x):
    return 3.0

def h(x):
    return 0*x + 3.0

def i(x):
    return x - x + 3.0

x = np.linspace(1.0, 5.0, 5)
Running in IPython 3.3.2 results in these outputs:
f(x) ->
array([ 1., 4., 9., 16., 25.]) -- what I expected
g(x) ->
3.0 (note only 1 element, and a float, not ndarray) -- not naively expected
h(x) ->
array([ 3., 3., 3., 3., 3.]) -- OK, fooled the broadcasting by having x do something
i(x) ->
array([ 3., 3., 3., 3., 3.]) -- same as h(x), avoiding the multiply, but at the risk of roundoff issues
Now I could use
gv = np.vectorize(g)
and get
gv(x) -> array([ 3., 3., 3., 3., 3.]) -- expected behavior
If my program is to (eventually) accept arbitrary user-entered free energy functions, this will cause problems unless all users understand numpy's internal broadcasting magic.
Or, I could reflexively np.vectorize everything to prevent this. The problem is the cost if the function will "just work" in numpy.
That is, using %timeit in IPython,
h(x) -> 100000 loops, best of 3: 3.45 µs per loop
If I vectorize h(x) needlessly (i.e. hv = np.vectorize(h)), I get
hv(x) -> 10000 loops, best of 3: 43.2 µs per loop
So, needlessly vectorizing is a huge penalty (40 microseconds for 5 function evals).
I guess I could do an initial test of the function's return when evaluated on a small ndarray, to see whether the return type is an array or a float, and then define a new function if it is a float, like:
def gv(x):
    return g(x) + 0.0*x
That just seems like a horrible kludge.
So - is there a better way to 'fool' numpy into efficiently broadcasting in this case?

To solve the problem shown: if you want a new array,
def g(x):
    return np.ones_like(x) * 3
or, if you want to set all elements of an array to 3 in place:
def g(x):
    x[:] = 3
Note there is no return statement here as you are simply updating array x so that all elements are 3.
The issue with def g(x): return(3) as shown is that there is no reference to the argument inside the function: it says, for any given input, return the scalar 3. Writing x = 3 inside the function would run into a similar issue, since it merely rebinds the local name x to the integer 3 instead of touching the numpy array. The slice assignment x[:] = 3, by contrast, goes through numpy.ndarray's item assignment (operating on a view of the whole array), so it writes 3 into every element of the existing array rather than rebinding a name.
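To see the difference between rebinding and in-place assignment concretely, here is a minimal sketch (the function names rebind and fill_inplace are just illustrative):
import numpy as np

def rebind(x):
    x = 3       # rebinds the local name only; the caller's array is untouched

def fill_inplace(x):
    x[:] = 3    # slice assignment writes 3 into every element of the existing array

x = np.linspace(1.0, 5.0, 5)
rebind(x)
print(x)        # [1. 2. 3. 4. 5.] -- unchanged
fill_inplace(x)
print(x)        # [3. 3. 3. 3. 3.]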

As others have suggested, you could wrap the user-provided functions to make sure the output shape is correct. For example:
def wrap_user_function(func, x):
    out = func(x)
    if np.isscalar(out):
        # broadcast a scalar result up to the shape of x
        return np.zeros_like(x) + out
    return out
This only handles the scalar output case specially, but it should at least take care of your g(x) issue, without imposing much of a performance hit.
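For example, a quick sketch of it in use, with the scalar-returning g from the question:
import numpy as np

def g(x):
    return 3.0

x = np.linspace(1.0, 5.0, 5)
print(wrap_user_function(g, x))              # [3. 3. 3. 3. 3.]
print(wrap_user_function(lambda v: v*v, x))  # [ 1.  4.  9. 16. 25.] -- array results pass through untouched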


Vectorize list returning python function into numpy nd-array [duplicate]

numpy.vectorize takes a function f:a->b and turns it into g:a[]->b[].
This works fine when a and b are scalars, but I can't think of a reason why it wouldn't work with b as an ndarray or list, i.e. f:a->b[] and g:a[]->b[][]
For example:
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

g = np.vectorize(f, otypes=[np.ndarray])
a = np.arange(4)
print(g(a))
This yields:
array([[ 0.  0.  0.  0.  0.],
       [ 1.  1.  1.  1.  1.],
       [ 2.  2.  2.  2.  2.],
       [ 3.  3.  3.  3.  3.]], dtype=object)
Ok, so that gives the right values, but the wrong dtype. And even worse:
g(a).shape
yields:
(4,)
So this array is pretty much useless. I know I can convert it doing:
np.array(list(map(list, g(a))), dtype=np.float32)
to give me what I want:
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)
but that is neither efficient nor pythonic. Can any of you guys find a cleaner way to do this?
np.vectorize is just a convenience function. It doesn't actually make code run any faster. If it isn't convenient to use np.vectorize, simply write your own function that works as you wish.
The purpose of np.vectorize is to transform functions which are not numpy-aware (e.g. take floats as input and return floats as output) into functions that can operate on (and return) numpy arrays.
Your function f is already numpy-aware -- it uses a numpy array in its definition and returns a numpy array. So np.vectorize is not a good fit for your use case.
The solution therefore is just to roll your own function f that works the way you desire.
A new parameter, signature, added in NumPy 1.12.0, does exactly what you want:
def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

g = np.vectorize(f, signature='()->(n)')
Then g(np.arange(4)).shape will give (4L, 5L).
Here the signature of f is specified: the (n) is the shape of the return value, and the () is the shape of the parameter, which is scalar. The parameters can be arrays too. For more complex signatures, see the Generalized Universal Function API.
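The signature can also describe array-valued inputs. For instance, a sketch with an illustrative reducing function (not from the question), where '(n)->()' maps each length-n row to a scalar:
import numpy as np

def last_minus_first(v):
    return v[-1] - v[0]

g = np.vectorize(last_minus_first, signature='(n)->()')
print(g(np.arange(12).reshape(3, 4)))  # [3 3 3]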
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

g = np.vectorize(f, otypes=[np.ndarray])

a = np.arange(4)
b = g(a)
b = np.array(b.tolist())
print(b)  # b.shape = (4, 5)

c = np.ones((2, 3, 4))
d = g(c)
d = np.array(d.tolist())
print(d)  # d.shape = (2, 3, 4, 5)
This should fix the problem, and it will work regardless of the size of your input. map only works for one-dimensional inputs. Using .tolist() and creating a new ndarray solves the problem more completely and, I believe, more nicely. Hope this helps.
You want to vectorize the function
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)
Assuming that you want plain np.float32 arrays as the result, you have to specify this as the otype. In your question, however, you specified otypes=[np.ndarray], which means you want every element of the result to be an np.ndarray. Thus, you correctly get a result of dtype=object.
The correct call would be
np.vectorize(f, signature='()->(n)', otypes=[np.float32])
For such a simple function, however, it is better to leverage numpy's ufuncs; np.vectorize just loops over the function in Python. So in your case, just rewrite your function as
def f(x):
    return np.multiply.outer(x, np.array([1, 1, 1, 1, 1], dtype=np.float32))
This is faster and produces less obscure errors. Note, however, that the result's dtype depends on x: if you pass a complex or quad-precision number, the result will be complex or quad precision as well.
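A quick way to see that dtype behaviour (a sketch; the values are arbitrary):
import numpy as np

ones = np.array([1, 1, 1, 1, 1], dtype=np.float32)
print(np.multiply.outer(np.float32(2.0), ones).dtype)        # float32
print(np.multiply.outer(np.complex128(2 + 1j), ones).dtype)  # complex128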
I've written a function that seems to fit your need:
def amap(func, *args):
    '''array version of built-in map

    amap(function, sequence[, sequence, ...]) -> array

    Examples
    --------
    >>> amap(lambda x: x**2, 1)
    array(1)
    >>> amap(lambda x: x**2, [1, 2])
    array([1, 4])
    >>> amap(lambda x, y: y**2 + x**2, 1, [1, 2])
    array([2, 5])
    >>> amap(lambda x: (x, x), 1)
    array([1, 1])
    >>> amap(lambda x, y: [x**2, y**2], [1, 2], [3, 4])
    array([[1, 9], [4, 16]])
    '''
    args = np.broadcast(None, *args)
    res = np.array([func(*arg[1:]) for arg in args])
    shape = args.shape + res.shape[1:]
    return res.reshape(shape)
Let's try
def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

amap(f, np.arange(4))
Outputs
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.]], dtype=float32)
You may also wrap it with lambda or partial for convenience:
g = lambda x: amap(f, x)
g(np.arange(4))
Note the docstring of vectorize says
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Thus we would expect amap to have performance similar to vectorize. I didn't check it; any performance tests are welcome.
If performance really matters, you should consider something else, e.g. direct array calculation with reshape and broadcasting to avoid the loop in pure Python (both vectorize and amap are the latter case).
The best way to solve this would be to use a 2-D NumPy array (in this case a column array) as an input to the original function, which will then generate a 2-D output with the results I believe you were expecting.
Here is what it might look like in code:
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

a = np.arange(4).reshape((4, 1))
b = f(a)
print(b)  # b is a 2-D array with shape (4, 5)
This is a much simpler and less error-prone way to complete the operation. Rather than trying to transform the function with numpy.vectorize, this method relies on NumPy's natural ability to broadcast arrays. The trick is to make sure that at least one dimension has an equal length between the arrays.
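The same reshaping can be spelled with np.newaxis, which some find more readable; a sketch equivalent to the reshape((4, 1)) above:
import numpy as np

def f(x):
    return x * np.array([1, 1, 1, 1, 1], dtype=np.float32)

a = np.arange(4)
print(f(a[:, np.newaxis]).shape)  # (4, 5) -- same as f(a.reshape((4, 1)))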

An efficient way to iterate over a multidimensional array?

I'm trying to find a way to perform operations on each element across multiple 2D arrays without having to loop over them. Or at least, without needing two for loops. My code calculates the standard deviation of each pixel over a series of images (arrays). Now, the number of images is not the problem; it is the size of the arrays that makes the code extremely slow. The following is a working example of what I have.
import numpy as np

# reshape(# of images (arrays), # of rows, # of cols)
a = np.arange(32).reshape(2, 4, 4)

stddev_arr = np.array([])
for i in range(4):
    for j in range(4):
        pixel = a[0:, i, j]
        stddev = np.std(pixel)
        stddev_arr = np.append(stddev_arr, stddev)
My actual data is 2000x2000, making this code loop 4000000 times. Is there a better way to do this?
Any advice is extremely appreciated.
You're already using numpy. numpy's std() function takes an axis argument that tells it what axis you want it to operate on (in this case the zeroth axis). Because this offloads the calculation to numpy's C backend (possibly using SIMD optimizations for your processor that vectorize a lot of operations), it's much faster than iterating.

Another time-consuming operation in your code is appending to stddev_arr. Appending to numpy arrays is slow because the entire array is copied into new memory before the new element is added. Since you already know how big that array needs to be, you might as well preallocate it.
a = np.arange(32).reshape(2, 4, 4)
stdev = np.std(a, axis=0)
This gives a 4x4 array
array([[8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.],
       [8., 8., 8., 8.]])
To flatten this into a 1D array, do flat_stdev = stdev.flatten().
Comparing the execution times:
# Using only numpy
def fun1(arr):
    return np.std(arr, axis=0).flatten()

# Your function
def fun2(arr):
    stddev_arr = np.array([])
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]):
            pixel = arr[0:, i, j]
            stddev = np.std(pixel)
            stddev_arr = np.append(stddev_arr, stddev)
    return stddev_arr

# Your function, but pre-allocating stddev_arr
def fun3(arr):
    stddev_arr = np.zeros((arr.shape[1] * arr.shape[2],))
    x = 0
    for i in range(arr.shape[1]):
        for j in range(arr.shape[2]):
            pixel = arr[0:, i, j]
            stddev = np.std(pixel)
            stddev_arr[x] = stddev
            x += 1
    return stddev_arr
First, let's make sure all these functions are equivalent:
a = np.random.random((3, 10, 10))
assert np.all(fun1(a) == fun2(a))
assert np.all(fun1(a) == fun3(a))
Yup, all give the same result. Now, let's try with a bigger array.
a = np.random.random((3, 100, 100))
x = timeit.timeit('fun1(a)', setup='from __main__ import fun1, a', number=10)
# x: 0.003302899989648722
y = timeit.timeit('fun2(a)', setup='from __main__ import fun2, a', number=10)
# y: 5.495519500007504
z = timeit.timeit('fun3(a)', setup='from __main__ import fun3, a', number=10)
# z: 3.6250679999939166
Wow! We get a ~1.5x speedup just by preallocating.
Even more wow: using numpy's std() with the axis argument gives a > 1000x speedup, and this is just for the 100x100 array! With bigger arrays, you can expect to see even bigger speedup.
Based on what you have provided, you can reshape your array another way to vectorize the operation and replace your two loops. Then you only have to call np.std once, on the axis you want.
a = np.arange(32).reshape(2, 4, 4)
a = a.reshape(2, -1).transpose()
stddev_arr = np.std(a, axis=1)
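A quick consistency check of this reshape approach against the axis-based answer above (a sketch):
import numpy as np

a = np.arange(32).reshape(2, 4, 4)
via_reshape = np.std(a.reshape(2, -1).transpose(), axis=1)
via_axis = np.std(a, axis=0).flatten()
assert np.allclose(via_reshape, via_axis)  # both give 16 values of 8.0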

weighted moving average with numpy.convolve

I'm writing a moving average function that uses the convolve function in numpy, which should be equivalent to a (weighted) moving average. When my weights are all equal (as in a simple arithmetic average), it works fine:
data = numpy.arange(1, 11)
numdays = 5
w = [1.0/numdays] * numdays
numpy.convolve(data, w, 'valid')
gives
array([ 3., 4., 5., 6., 7., 8.])
However, when I try to use a weighted average
w = numpy.cumsum(numpy.ones(numdays,dtype=float),axis=0); w = w/numpy.sum(w)
instead of the 3.667, 4.667, 5.667, 6.667, ... I expect for the same data, I get
array([ 2.33333333,  3.33333333,  4.33333333,  5.33333333,  6.33333333,
        7.33333333])
If I remove the 'valid' flag, I don't even see the correct values. I would really like to use convolve for the WMA as well as MA as it makes the code cleaner (same code, different weights) and otherwise I think I'll have to loop through all the data and take slices.
Any ideas about this behavior?
What you want is np.correlate; in a convolution the second argument is reversed, basically, so you would get your expected result with np.convolve(data, w[::-1], 'valid').
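A minimal check with the data and weights from the question (a sketch):
import numpy as np

data = np.arange(1, 11)
w = np.cumsum(np.ones(5, dtype=float))
w = w / w.sum()

# reversing the kernel turns the convolution into a correlation
print(np.convolve(data, w[::-1], 'valid'))
# [3.66666667 4.66666667 5.66666667 6.66666667 7.66666667 8.66666667]
print(np.correlate(data, w, 'valid'))  # same values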

Difference in FFT between IDL and Python

I'm porting some simple IDL code to Python. However, the FFT values returned by the SciPy/NumPy packages are different from the IDL ones, and I can't find out why.
Reducing it all to a simple example of 8 elements, I found that the SciPy/NumPy routines return values that are 8 (2^3) times bigger than the IDL ones (a normalization problem, I thought).
Here is the example code (copied from here) in both languages:
IDL
signal = ([-2., 8., -6., 4., 1., 0., 3., 5.])
fourier = fft(signal)
print, fourier
returns
( 1.62500, 0.00000) ( 0.420495, 0.506282) ( 0.250000, 0.125000) ( -1.17050, -1.74372) ( -2.62500, -0.00000) ( -1.17050, 1.74372) ( 0.250000, -0.125000) ( 0.420495, -0.506282)
Python
from scipy.fftpack import fft
import numpy as N
…
signal = N.array([-2., 8., -6., 4., 1., 0., 3., 5.])
fourier = fft(signal)
print(fourier)
returns
[ 13. +0.j , 3.36396103 +4.05025253j, 2. +1.j , -9.36396103-13.94974747j, -21. +0.j , -9.36396103+13.94974747j, 2. -1.j , 3.36396103 -4.05025253j]
I did it with the NumPy package and I got the same results. I also tried print(fft(signal, 8)) just in case, but it returned the same, as expected.
However that's not all: coming back to my real array of 256 elements, I found that the difference was no longer 8 or 256, but 256*8! It's just insane.
Although I worked around the problem, I NEED to know why there is that difference.
Solved: it was just the normalization; at some point I divided the IDL 256-element array by a factor of 8 that I forgot to remove. Dougal's answer includes the documentation that I missed.
IDL and numpy use slightly different definitions of the DFT. Numpy's is (from the documentation):

    A_k = \sum_{m=0}^{n-1} a_m \exp\left(-2\pi i \frac{m k}{n}\right), \qquad k = 0, \ldots, n-1

while IDL's is (from its documentation):

    F(u) = \frac{1}{N} \sum_{x=0}^{N-1} f(x) \exp\left(-2\pi i \frac{u x}{N}\right)

Numpy's m is the same as IDL's x, k is u, and n is N. I think a_m and f(x) are the same thing as well. So the factor of 1/N is the obvious difference, explaining the difference of 8 in your 8-element case.
I'm not sure about the 256*8 one for the 256-elt case; could you maybe post the original array and both outputs somewhere? (Does this happen for all 256-elt arrays? What about other sizes? I don't have IDL....)
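The 1/N factor is easy to confirm numerically; a sketch using numpy.fft (newer NumPy versions also accept norm='forward' to apply the 1/N scaling directly):
import numpy as np

signal = np.array([-2., 8., -6., 4., 1., 0., 3., 5.])
# dividing NumPy's unnormalized DFT by N reproduces IDL's convention
print(np.fft.fft(signal) / len(signal))
# first element: (1.625+0j), matching IDL's ( 1.62500, 0.00000)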

Numpy array broadcasting with vector parameters

Is it possible to do array broadcasting in numpy with parameters that are vectors?
For example, I know that I can do this
import numpy
from numpy import log, inf  # the original presumably had these names in scope

def bernoulli_fraction_to_logodds(fraction):
    if fraction == 1.0:
        return inf
    return log(fraction / (1 - fraction))

bernoulli_fraction_to_logodds = numpy.frompyfunc(bernoulli_fraction_to_logodds, 1, 1)
and have it work with the whole array. What if I have a function that takes a 2-element vector and returns a 2-element vector? Can I pass it an array of 2-element vectors? E.g.,
def beta_ml_fraction(beta):
    a = beta[0]
    b = beta[1]
    return a / (a + b)

beta_ml_fraction = numpy.frompyfunc(beta_ml_fraction, 1, 1)
Unfortunately, this doesn't work. Is there a similar function to frompyfunc that does? I can hack around this when they are 2-element vectors, but what about when they are n-element vectors?
Thus, input of (2,3) should give 0.4, but input of [[2,3], [3,3]] should give [0.4, 0.5].
I don't think frompyfunc can do this, though I could be wrong.
Regarding np.vectorize, A. M. Archibald wrote:
In fact, anything that goes through python code for the "combine two scalars" step will be slow. The slowness of looping in python is not because python's looping constructs are slow, it's because executing python code is slow. So vectorize is kind of a cheat - it doesn't actually run fast, but it is convenient.
So np.frompyfunc (and np.vectorize) are just syntactic sugar -- they don't make Python functions run any faster.
After realizing that, my interest in frompyfunc flagged (to near zero).
There is nothing unreadable about a Python loop, so either use one explicitly, or rewrite the function to truly leverage numpy (by writing truly vectorized equations).
import numpy as np

def beta_ml_fraction(beta):
    a = beta[:, 0]
    b = beta[:, 1]
    return a / (a + b)

arr = np.array([(2, 3)], dtype=float)
print(beta_ml_fraction(arr))
# [ 0.4]

arr = np.array([(2, 3), (3, 3)], dtype=float)
print(beta_ml_fraction(arr))
# [ 0.4  0.5]
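If the inputs can have any number of leading dimensions, indexing the last axis with an ellipsis generalizes this; a sketch (not from the original answer):
import numpy as np

def beta_ml_fraction(beta):
    beta = np.asarray(beta)
    a = beta[..., 0]  # works for shapes (2,), (n, 2), (m, n, 2), ...
    b = beta[..., 1]
    return a / (a + b)

print(beta_ml_fraction((2.0, 3.0)))                # 0.4
print(beta_ml_fraction([[2.0, 3.0], [3.0, 3.0]]))  # [0.4 0.5]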
When dealing with arrays of two-dimensional vectors, I like to keep the x and y components as the first index. For this I make heavy use of transpose():
def beta_ml_fraction(beta):
    a = beta[0]
    b = beta[1]
    return a / (a + b)

arr = np.array([(2, 3), (3, 3)], dtype=float)
print(beta_ml_fraction(arr.transpose()))
# [ 0.4  0.5]
The advantage of this approach is that handling multidimensional arrays of two-dimensional vectors becomes easier:
x = np.arange(18, dtype=float).reshape(2, 3, 3)
print(x)
# array([[[  0.,   1.,   2.],
#         [  3.,   4.,   5.],
#         [  6.,   7.,   8.]],
#
#        [[  9.,  10.,  11.],
#         [ 12.,  13.,  14.],
#         [ 15.,  16.,  17.]]])

print(beta_ml_fraction(x))
# array([[ 0.        ,  0.09090909,  0.15384615],
#        [ 0.2       ,  0.23529412,  0.26315789],
#        [ 0.28571429,  0.30434783,  0.32      ]])
