I have a function f, for example:
def f(x):
    return x**2
and want to obtain an array consisting of f evaluated over an interval, for example the unit interval (0,1). We can do this as follows:
import numpy as np
X = np.arange(0,1,0.01)
arr = np.array(list(map(f, X)))
However, this last line is very time consuming when the function is complicated (in my case it involves some integrals). Is there a way to do this faster? I am happy to have a non-elegant solution - the focus is on speed.
You could use list comprehension to slightly decrease runtime.
arr = [f(x) for x in range(0, 5)] # range is the interval
This should work. It will only slightly decrease runtime though. You shouldn't be worried about runtime unless you use very large numbers with map().
If f is so complicated that it can't be expressed in terms of compiled array operations, and can only take scalars, I have found that frompyfunc gives the best performance (about 2x compared to an explicit loop)
In [76]: def f(x):
    ...:     return x**2
    ...:
In [77]: foo = np.frompyfunc(f,1,1)
In [78]: foo(np.arange(4))
Out[78]: array([0, 1, 4, 9], dtype=object)
In [79]: foo(np.arange(4)).astype(int)
Out[79]: array([0, 1, 4, 9])
It returns dtype object, so needs an astype. np.vectorize uses this as well, but is a bit slower. Both generalize to various shapes of input array(s).
For a 1d result, fromiter works with map (without the list part):
In [84]: np.fromiter((f(x) for x in range(4)),int)
Out[84]: array([0, 1, 4, 9])
In [86]: np.fromiter(map(f, range(4)),int)
Out[86]: array([0, 1, 4, 9])
You'll have to do your own timings in a realistic case.
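As a rough starting point for such timings, here is a minimal sketch that runs the three scalar-function approaches side by side. The function slow_f is a hypothetical stand-in for an expensive scalar-only function, not the one from the question, and the numbers it prints will vary with the machine and with the real f.

```python
import math
import timeit

import numpy as np


def slow_f(x):
    # Hypothetical stand-in for an expensive function that only accepts scalars.
    return math.sin(x) ** 2 + math.sqrt(x)


X = np.arange(0, 1, 0.01)


def with_map():
    return np.array(list(map(slow_f, X)))


def with_fromiter():
    # Passing count lets NumPy allocate the result array up front.
    return np.fromiter(map(slow_f, X), dtype=float, count=len(X))


def with_frompyfunc():
    # frompyfunc returns an object array, hence the astype.
    return np.frompyfunc(slow_f, 1, 1)(X).astype(float)


for fn in (with_map, with_fromiter, with_frompyfunc):
    seconds = timeit.timeit(fn, number=200)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

All three call slow_f once per element, so when f itself dominates the cost the differences between them are likely to be small.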
Use operations that operate on entire arrays. For example, with a function that just squares the input (slightly corrected from your example):
def f(x):
    return x**2
then you'd just do
arr = f(X)
because NumPy defines operators like ** to operate on entire arrays at once.
Your real function might not be quite as straightforward. You say there are integrals involved; to make whole-array operations work with that, you might have to pass arguments differently or change what you're using to compute the integrals. In general, though, whole-array operations will vastly outperform anything that needs to call Python-level code in a loop.
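To illustrate what restructuring an integral for whole-array evaluation can look like, suppose (purely as an assumption, since the real function isn't shown) that f(x) is the integral of some integrand g from 0 to x. Rather than integrating separately for every x, one cumulative trapezoid sum over the grid yields all values at once. The integrand g and the grid below are made up for the sketch.

```python
import numpy as np


def g(t):
    # Hypothetical integrand, written with array operations only.
    return t ** 2


X = np.linspace(0.0, 1.0, 1001)
G = g(X)

# Cumulative trapezoidal rule: F[k] approximates the integral of g from X[0] to X[k].
steps = 0.5 * (G[1:] + G[:-1]) * np.diff(X)
F = np.concatenate(([0.0], np.cumsum(steps)))

# For this integrand the exact result is x**3 / 3, so the error can be inspected.
print(np.max(np.abs(F - X ** 3 / 3)))
```

This trades per-point adaptive accuracy for speed, so whether it is acceptable depends on how smooth the real integrand is.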
You could try numpy.vectorize. It's a convenient way to apply a function to a list or array:
import numpy as np
def foo(x):
    return x**2
foo = np.vectorize(foo)
arr = np.arange(10)
In [1]: foo(arr)
Out[1]: array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81])
Related
I've found myself using NumPy arrays for memory management and speed of computation more and more lately, on large volumes of structured data (such as points and polygons). In doing so, there is always a situation where I need to perform some function f(x) on the entire array. From experience, and Googling, iterating over the array is not the way to do this, so instead a function should be vectorized and broadcast to the entire array.
Looking at the documentation for numpy.vectorize we get this example:
def myfunc(a, b):
    "Return a-b if a>b, otherwise return a+b"
    if a > b:
        return a - b
    else:
        return a + b
>>> vfunc = np.vectorize(myfunc)
>>> vfunc([1, 2, 3, 4], 2)
array([3, 4, 1, 2])
And per the docs it really just creates a for loop, so it doesn't access the lower C loops for truly vectorized operations (either in BLAS or SIMD). So that got me wondering, if the above is "vectorized", what is this?:
def myfunc_2(a, b):
    cond = a > b
    a[cond] -= b
    a[~cond] += b
    return a
>>> myfunc_2(np.array([1, 2, 3, 4]), 2)
array([3, 4, 1, 2])
Or even this:
>>> a = np.array([1, 2, 3, 4])
>>> b = 2
>>> np.where(a > b, a - b, a + b)
array([3, 4, 1, 2])
So I ran some tests on these, what I believe to be comparable examples:
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import vfunc, arr'
>>> timeit('vfunc(arr, 50)', setup=setup, number=1)
0.60175449999997
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_2, arr'
>>> timeit('myfunc_2(arr, 50)', setup=setup, number=1)
0.07464979999997468
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_3, arr'
>>> timeit('myfunc_3(arr, 50)', setup=setup, number=1)
0.0222587000000658
And with larger run windows:
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import vfunc, arr'
>>> timeit('vfunc(arr, 50)', setup=setup, number=1000)
621.5853878000003
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_2, arr'
>>> timeit('myfunc_2(arr, 50)', setup=setup, number=1000)
98.19819199999984
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_3, arr'
>>> timeit('myfunc_3(arr, 50)', setup=setup, number=1000)
26.128515100000186
Clearly the other options are major improvements over using numpy.vectorize. This leads me to wonder several things about why anybody would use numpy.vectorize at all if you can write what appear to be "purely vectorized" functions or use battery provided functions like numpy.where.
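The timings above refer to myfunc_3, which is never shown. A minimal sketch of a comparable benchmark follows, assuming myfunc_3 simply wraps the np.where expression from earlier. Because myfunc_2 modifies its argument in place, the sketch hands it a fresh copy on every call; that adds some copying overhead to its timing but keeps the input data identical across runs.

```python
from timeit import timeit

import numpy as np


def myfunc(a, b):
    return a - b if a > b else a + b


vfunc = np.vectorize(myfunc)


def myfunc_2(a, b):
    # Modifies a in place, so callers that reuse the input should pass a copy.
    cond = a > b
    a[cond] -= b
    a[~cond] += b
    return a


def myfunc_3(a, b):
    # Assumed definition: the question does not show myfunc_3.
    return np.where(a > b, a - b, a + b)


arr = np.random.randint(200, size=(100_000,))

for name, call in [
    ("vfunc", lambda: vfunc(arr, 50)),
    ("myfunc_2", lambda: myfunc_2(arr.copy(), 50)),
    ("myfunc_3", lambda: myfunc_3(arr, 50)),
]:
    print(name, timeit(call, number=5))
```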
Now for the questions:
What are the requirements to say a function is "vectorized" if not converted via numpy.vectorize? Just broadcastable in its entirety?
How does NumPy determine if a function is "vectorized"/broadcastable?
Why isn't this form of vectorization documented anywhere? (i.e., why doesn't NumPy have a "How to write a vectorized function" page?)
"Vectorization" can mean different things depending on context. Use of low level C code with BLAS or SIMD is just one.
In physics 101, a vector represents a point or velocity whose numeric representation can vary with coordinate system. Thus I think of "vectorization", broadly speaking, as performing math operations on the "whole" array, without explicit control over numerical elements.
numpy basically adds a ndarray class to python. It has a large number of methods (and operators and ufunc) that do indexing and math in compiled code (not necessarily using processor specific SIMD). The big gain in speed, relative to Python level iteration, is the use of compiled code optimized for the ndarray data structure. Python level iteration (interpreted code) on arrays is actually slower than on equivalent lists.
I don't think numpy formally defines "vectorization". There isn't a "vector" class. I haven't searched the documentation for those terms. Here, and possibly on other forums, it just means, writing code that makes optimal use of ndarray methods. Generally that means avoiding python level iteration.
np.vectorize is a tool for applying functions that only accept scalar inputs to arrays. It does not compile or otherwise "look inside" that function. But it does accept and apply arguments in a fully broadcasted sense, such as in:
In [162]: vfunc(np.arange(3)[:,None],np.arange(4))
Out[162]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 1, 4, 5]])
Speedwise np.vectorize is slower than the equivalent list comprehension, at least for smaller sample cases. Recent testing shows that it scales better, so for large inputs it may be better. But still the performance is nothing like your myfunc_2.
myfunc is not "vectorized" simply because expressions like if a > b do not work with arrays.
np.where(a > b, a - b, a + b) is "vectorized" because all arguments to the where work with arrays, and where itself uses them with full broadcasting powers.
In [163]: a,b = np.arange(3)[:,None], np.arange(4)
In [164]: np.where(a>b, a-b, a+b)
Out[164]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 1, 4, 5]])
myfunc_2 is "vectorized", at least in a:
In [168]: myfunc_2(a,2)
Out[168]:
array([[4],
[1],
[2]])
It does not work when b is array; it's trickier to match the a[cond] shape with anything but a scalar:
In [169]: myfunc_2(a,b)
Traceback (most recent call last):
Input In [169] in <cell line: 1>
myfunc_2(a,b)
Input In [159] in myfunc_2
a[cond] -= b
IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 4
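One way around that shape mismatch, sketched here as a hypothetical variant named myfunc_2b (not part of the question), is to broadcast both inputs to a common shape first and index b with the same mask. It also writes into a copy, so the caller's array is left untouched.

```python
import numpy as np


def myfunc_2b(a, b):
    # Hypothetical variant of myfunc_2 that accepts array b and does not modify a.
    a, b = np.broadcast_arrays(a, b)
    out = a.copy()  # writable array with the full broadcast shape
    cond = a > b
    out[cond] -= b[cond]
    out[~cond] += b[~cond]
    return out


a, b = np.arange(3)[:, None], np.arange(4)
print(myfunc_2b(a, b))
```

The sketch assumes a and b share a dtype; mixing integer a with float b would need an explicit cast of out before the in-place updates.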
===
What are the requirements to say a function is "vectorized" if not converted via numpy.vectorize? Just broadcastable in its entirety?
In your examples, myfunc is not "vectorized" because it only works with scalars. vfunc is fully "vectorized", but not faster. where is also "vectorized" and (probably) faster, though this may be scale dependent. myfunc_2 is only "vectorized" in a.
How does NumPy determine if a function is "vectorized"/broadcastable?
numpy doesn't determine anything like this. numpy is a ndarray class with many methods. It's just the use of those methods that makes a block of code "vectorized".
Why isn't this form of vectorization documented anywhere? (i.e., why doesn't NumPy have a "How to write a vectorized function" page?)
Keep in mind the distinction between "vectorization" as a performance strategy, and the basic idea of operating on whole arrays.
Vectorize Documentation
The documentation provides a great example in mypolyval: there's no good way to write that as a where condition or using simple logic.
def mypolyval(p, x):
    _p = list(p)
    res = _p.pop(0)
    while _p:
        res = res*x + _p.pop(0)
    return res
vpolyval = np.vectorize(mypolyval, excluded=['p'])
vpolyval(p=[1, 2, 3], x=[0, 1])
array([3, 6])
That is, np.vectorize is clearly what the reference documentation states: convenience to write code in the same fashion even without the performance benefits.
As for the documentation telling you how to write vectorized code: it does, in the relevant places. The np.vectorize documentation says what you mentioned above:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
Remember: the documentation is an API reference guide with some additional caveats: it's not a NumPy tutorial.
UFunc Documentation
The appropriate reference documentation and glossary document this clearly:
A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features. That is, a ufunc is a “vectorized” wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs. For detailed information on universal functions, see Universal functions (ufunc) basics.
NumPy hands off array processing to C, where looping and computation are much faster than in Python. To exploit this, programmers using NumPy eliminate Python loops in favor of array-to-array operations. Vectorization can refer both to the C offloading and to structuring NumPy code to leverage it.
Summary
Simply put, np.vectorize is for code legibility so you can write similar code to actually vectorized ufuncs. It is not for performance, but there are times when you have no good alternative.
I want to implement the following Matlab code in Python:
x=1:100;
y=20*log10(x);
I tried using Numpy to do this:
y = numpy.zeros(x.shape)
for i in range(len(x)):
    y[i] = 20*math.log10(x[i])
But this uses a for loop; is there any way to do a vectorized operation like in Matlab? I know for some simple math such as division and multiplication, it's possible. But what about other more sophisticated operations like logarithm here?
y = numpy.log10(numpy.arange(1, 101)) * 20
In [30]: numpy.arange(1, 10)
Out[30]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
In [31]: numpy.log10(numpy.arange(1, 10))
Out[31]:
array([ 0. , 0.30103 , 0.47712125, 0.60205999, 0.69897 ,
0.77815125, 0.84509804, 0.90308999, 0.95424251])
In [32]: numpy.log10(numpy.arange(1, 10)) * 20
Out[32]:
array([ 0. , 6.02059991, 9.54242509, 12.04119983,
13.97940009, 15.56302501, 16.9019608 , 18.06179974, 19.08485019])
Yep, there certainly is.
x = numpy.arange(1, 101)
y = 20 * numpy.log10(x)
Numpy has a lot of built-in array operators like log10. If it's not listed in numpy's documentation and you can't generate it from combining built-in methods, then there's no easy way to do it efficiently. You can implement a C-level function to work on numpy arrays and compile that, but this is a lot more work than one or two lines of Python code.
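A short sketch of what combining built-in operations looks like in practice: it checks that the whole-array expression matches the loop from the question, then composes a power and a division to invert the conversion. The variable names are illustrative only.

```python
import math

import numpy as np

x = np.arange(1, 101)  # 1, 2, ..., 100, like Matlab's 1:100

# Loop version from the question.
y_loop = np.zeros(x.shape)
for i in range(len(x)):
    y_loop[i] = 20 * math.log10(x[i])

# Whole-array version.
y_vec = 20 * np.log10(x)

# Combining built-ins: convert the decibel values back to the original numbers.
x_back = 10 ** (y_vec / 20)

print(np.allclose(y_loop, y_vec), np.allclose(x_back, x))
```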
For your case you almost have the right output already:
y = 20*numpy.log10(x)
You may want to take a look at the Numpy documentation. This is a good place to start:
http://docs.scipy.org/doc/numpy/reference/routines.html
And specifically related to your question:
http://docs.scipy.org/doc/numpy/reference/routines.math.html
If you're not trying to do anything complicated, the original code could be implemented this way as well, without requiring the use of numpy, if I'm not mistaken.
>>> import math
>>> x = range(1, 101)
>>> y = [ 20 * math.log10(z) for z in x ]
Apart from performing vectorized operation using numpy standard vectorized functions, you can also make your custom vectorized function using numpy.vectorize. Here is one example:
>>> def myfunc(a, b):
...     "Return a-b if a>b, otherwise return a+b"
...     if a > b:
...         return a - b
...     else:
...         return a + b
>>>
>>> vfunc = np.vectorize(myfunc)
>>> vfunc([1, 2, 3, 4], 2)
array([3, 4, 1, 2])
As mentioned in the documentation, unlike numpy's standard vectorized functions, this won't improve performance.
Is it possible to construct a numpy matrix from a function? In this case specifically the function is the absolute difference of two vectors: S[i,j] = abs(A[i] - B[j]). A minimal working example that uses regular python:
import numpy as np
A = np.array([1,3,6])
B = np.array([2,4,6])
S = np.zeros((3,3))
for i,x in enumerate(A):
    for j,y in enumerate(B):
        S[i,j] = abs(x-y)
Giving:
[[ 1. 3. 5.]
[ 1. 1. 3.]
[ 4. 2. 0.]]
It would be nice to have a construction that looks something like:
def build_matrix(shape, input_function, *args)
where I can pass an input function with its arguments and retain the speed advantage of numpy.
In addition to what @JoshAdel has suggested, you can also use the outer method of any numpy ufunc to do the broadcasting in the case of two arrays.
In this case, you just want np.subtract.outer(A, B) (Or, rather, the absolute value of it).
While either one is fairly readable for this example, in some cases broadcasting is more useful, while in others using ufunc methods is cleaner.
Either way, it's useful to know both tricks.
E.g.
import numpy as np
A = np.array([1,3,6])
B = np.array([2,4,6])
diff = np.subtract.outer(A, B)
result = np.abs(diff)
Basically, you can use outer, accumulate, reduce, and reduceat with any numpy ufunc such as subtract, multiply, divide, or even things like logical_and, etc.
For example, np.cumsum is equivalent to np.add.accumulate. This means you could implement something like a cumdiv by np.divide.accumulate if you ever needed to.
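The ufunc methods mentioned above can be sketched together as follows. The input for the "cumdiv" line is made up, and floats are used there so that true division applies.

```python
import numpy as np

A = np.array([1, 3, 6])
B = np.array([2, 4, 6])

# outer: apply the ufunc to every pair (A[i], B[j]).
S = np.abs(np.subtract.outer(A, B))

# accumulate: running application along an axis; add.accumulate matches cumsum.
running_sum = np.add.accumulate(A)

# reduce: collapse an axis; multiply.reduce matches prod.
product = np.multiply.reduce(A)

# A "cumdiv": each entry is the previous result divided by the next element.
cumdiv = np.divide.accumulate(np.array([64.0, 4.0, 2.0]))

print(S, running_sum, product, cumdiv, sep="\n")
```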
I recommend taking a look into numpy's broadcasting capabilities:
In [6]: np.abs(A[:,np.newaxis] - B)
Out[6]:
array([[1, 3, 5],
[1, 1, 3],
[4, 2, 0]])
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Then you could simply write your function as:
In [7]: def build_matrix(func,args):
   ...:     return func(*args)
   ...:
In [8]: def f1(A,B):
   ...:     return np.abs(A[:,np.newaxis] - B)
   ...:
In [9]: build_matrix(f1,(A,B))
Out[9]:
array([[1, 3, 5],
[1, 1, 3],
[4, 2, 0]])
This should also be considerably faster than your solution for larger arrays.
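For a helper closer to the build_matrix(shape, input_function, *args) signature in the question, np.fromfunction is one option. The sketch below is an assumption about how such a helper could look, and abs_diff is a hypothetical callback. Note that np.fromfunction calls the callback once with whole index arrays rather than once per element, so the callback itself must be written with array operations.

```python
import numpy as np


def build_matrix(shape, input_function, *args):
    # Hypothetical helper matching the signature sketched in the question.
    return np.fromfunction(
        lambda i, j: input_function(i, j, *args), shape, dtype=int
    )


def abs_diff(i, j, A, B):
    # i and j are integer index arrays, so A[i] and B[j] are array lookups.
    return np.abs(A[i] - B[j])


A = np.array([1, 3, 6])
B = np.array([2, 4, 6])
S = build_matrix((len(A), len(B)), abs_diff, A, B)
print(S)
```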
What is the best way to improve this code:
def my_func(x, y):
    ... do smth ...
    return cmp(x',y')
my_list = range(0, N)
my_list.sort(cmp=my_func)
A Python list takes a lot of memory in comparison with a numpy array (6800MB vs 700MB),
but numpy.array doesn't have a sort function with a cmp argument.
Are there other ways to improve memory usage or sort numpy's array with my cmp function?
Update: my current solution is a C function (shared with SWIG) that sorts a huge array of integers and returns it to python after sorting.
But I hope that there is some way to implement memory efficient sorting of huge datasets with Python. Any ideas?
If you can write a ufunc that converts your array into sortable keys, you can do a fast sort with argsort:
b = convert(a)
idx = np.argsort(b)
sort_a = a[idx]
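A small worked instance of this pattern, assuming a hypothetical convert that orders integers by absolute value (the real comparison logic isn't shown in the question):

```python
import numpy as np


def convert(a):
    # Hypothetical key function: one sortable key per element.
    return np.abs(a)


a = np.array([3, -1, -7, 2, 0, 5])
idx = np.argsort(convert(a), kind="stable")  # stable keeps ties in input order
sort_a = a[idx]
print(sort_a)
```

The key array and the index array each take additional memory on top of a, which matters at the sizes mentioned in the question.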
As an alternative you can use the builtin sorted with a numpy array:
>>> a = np.arange(10, 1, -1)
>>> sorted(a, cmp=lambda a,b: cmp(a,b))
[2, 3, 4, 5, 6, 7, 8, 9, 10]
It is not in-place, so you have 1400 MB compared to 6800 MB.
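The snippet above relies on Python 2, where sorted accepted a cmp argument. In Python 3 that argument is gone, and functools.cmp_to_key adapts a comparison function into a key. A minimal sketch, with my_cmp as a hypothetical comparison function:

```python
from functools import cmp_to_key

import numpy as np


def my_cmp(x, y):
    # Hypothetical comparison: negative, zero, or positive, like the old cmp().
    # int() is needed because NumPy booleans do not support subtraction.
    return int(x > y) - int(x < y)


a = np.arange(10, 1, -1)
result = sorted(a, key=cmp_to_key(my_cmp))
print([int(v) for v in result])
```

As before, the result is a Python list, so the memory cost of a list of that length still applies.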
What is the difference between vectorize and frompyfunc in numpy?
Both seem very similar. What is a typical use case for each of them?
Edit: As JoshAdel indicates, the class vectorize seems to be built upon frompyfunc. (see the source). It is still unclear to me whether frompyfunc may have any use case that is not covered by vectorize...
As JoshAdel points out, vectorize wraps frompyfunc. Vectorize adds extra features:
Copies the docstring from the original function
Allows you to exclude an argument from broadcasting rules.
Returns an array of the correct dtype instead of dtype=object
Edit: After some brief benchmarking, I find that vectorize is significantly slower (~50%) than frompyfunc for large arrays. If performance is critical in your application, benchmark your use-case first.
>>> a = numpy.indices((3,3)).sum(0)
>>> print a, a.dtype
[[0 1 2]
 [1 2 3]
 [2 3 4]] int32
>>> def f(x,y):
...     """Returns 2 times x plus y"""
...     return 2*x+y
...
>>> f_vectorize = numpy.vectorize(f)
>>> f_frompyfunc = numpy.frompyfunc(f, 2, 1)
>>> f_vectorize.__doc__
'Returns 2 times x plus y'
>>> f_frompyfunc.__doc__
'f (vectorized)(x1, x2[, out])\n\ndynamic ufunc based on a python function'
>>> f_vectorize(a,2)
array([[ 2,  4,  6],
       [ 4,  6,  8],
       [ 6,  8, 10]])
>>> f_frompyfunc(a,2)
array([[2, 4, 6],
       [4, 6, 8],
       [6, 8, 10]], dtype=object)
I'm not sure what the different use cases for each is, but if you look at the source code (/numpy/lib/function_base.py), you'll see that vectorize wraps frompyfunc. My reading of the code is mostly that vectorize is doing proper handling of the input arguments. There might be particular instances where you would prefer one vs the other, but it would seem that frompyfunc is just a lower level instance of vectorize.
Although both methods provide a way to build your own ufunc-like callable, numpy.frompyfunc always returns an array of Python objects (dtype=object), while numpy.vectorize lets you specify the return type.
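A minimal sketch of that difference in return types, reusing the f from the earlier answer. The choice of float for otypes is arbitrary and only there to show that the output dtype can be requested up front.

```python
import numpy as np


def f(x, y):
    """Returns 2 times x plus y"""
    return 2 * x + y


a = np.indices((3, 3)).sum(0)

f_frompyfunc = np.frompyfunc(f, 2, 1)
f_vectorize = np.vectorize(f, otypes=[float])  # request float output explicitly

obj_result = f_frompyfunc(a, 2)       # dtype=object
cast_result = obj_result.astype(int)  # convert after the fact
vec_result = f_vectorize(a, 2)        # already float64

print(obj_result.dtype, cast_result.dtype, vec_result.dtype)
```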