Say I have the following simple function that I use to generate random numbers:
import numpy as np

def my_func():
    rvs = np.random.random(size=3)
    return rvs[2] - rvs[1]
I want to call this function a number of times, let's say 1000, and store the results in an array, for example:
result = []
for _ in range(1000):
    result += [my_func()]
Is there a way to use numpy to vectorize this operation and make everything faster? I don't mind if the workflow changes.
If I understand your question correctly, you just need to use the np.random.rand function:
np.random.rand(1000)
This function creates an array of the given shape and populates it with random samples from a uniform distribution over [0, 1).
You can vectorize as follows:
rvs_vect = np.random.rand(10000, 3)
result = rvs_vect[:,2] - rvs_vect[:,1]
rvs_vect[:,1] selects all rows in column 1.
rvs_vect[:,2] selects all rows in column 2.
On my machine, for inputs of 10000 elements, this is about 100 times faster than your solution and the other proposed ones (np.vectorize and a list comprehension).
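For reference, a minimal timing sketch of that comparison (the function names and sizes here are just illustrative; exact numbers will vary per machine):

import timeit
import numpy as np

def loop_version(n=10000):
    # original approach: one random call per sample
    result = []
    for _ in range(n):
        rvs = np.random.random(size=3)
        result.append(rvs[2] - rvs[1])
    return result

def vectorized_version(n=10000):
    # proposed approach: one random call for all samples
    rvs_vect = np.random.rand(n, 3)
    return rvs_vect[:, 2] - rvs_vect[:, 1]

print(timeit.timeit(loop_version, number=100))
print(timeit.timeit(vectorized_version, number=100))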
Extras
I have prepared an example for you with Numba. Numba is an open-source JIT compiler that translates a subset of Python and NumPy code into fast machine code, although you will not gain a substantial advantage over NumPy for this type of operation.
import numba as nb
import numpy as np

@nb.njit
def my_rand(n):
    rvs_vect = np.random.rand(n, 3)
    return rvs_vect[:,2] - rvs_vect[:,1]
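A quick usage sketch (the first call includes the JIT compilation cost, so benchmark a later call):

result = my_rand(1000)   # first call compiles the function
result = my_rand(1000)   # subsequent calls run the compiled machine code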
You could try result = [my_func() for i in range(1000)]; it is already fast enough.
Try this:
import numpy as np

def my_func(arr):
    rvs = np.random.random(size=3)
    return rvs[2] - rvs[1]

vfunc = np.vectorize(my_func)
result = []
result.append(vfunc([1]*1000))
print(result)
Hope it helps.
Explanation:
np.vectorize vectorizes the function. Normally you would pass a NumPy array whose elements the function operates on, but here I just passed a dummy list so that the function is executed 1000 times; the rest is as you were doing.
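For illustration, the call produces one value per element of the dummy list (a small sketch reusing the vfunc defined above):

out = vfunc([1] * 1000)
print(out.shape)   # (1000,) -- one random difference per dummy element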
Related
I have some complicated function called dis(x), which returns a number.
I am making two lists called, let's say, "indices" and "values". So what I do is the following:
values = []
indices = []
for i in np.arange(0.01, 4, 0.01):
    values.append(dis(i))
    indices.append(i)
So I have the following problem: how do I find the index j (from indices) for which dis(j) (from values) is closest to some number k?
A combination of a list comprehension and numpy's argmin function will do the job for you.
import numpy as np

values = []
indices = []

def dis(x):
    return 1e6*x**2

for i in np.arange(0.01, 4, 0.01):
    values.append(dis(i))
    indices.append(i)

target = 10000
closest_index = np.argmin([np.abs(x - target) for x in values])
print(closest_index)
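If you also need the corresponding x value and dis(x), you can index back into the two lists (a small sketch reusing the variables above):

print(indices[closest_index])   # the i whose dis(i) is closest to target
print(values[closest_index])    # the corresponding dis(i)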
The way you are stating it, I see two options:
Brute force it (try many indices i, and then see which dis(i) ended up closest to k). Works best when dis is reasonably fast and the possible indices are reasonably few.
Learn about optimization: https://en.wikipedia.org/wiki/Optimization_problem. This is a pretty extensive field, but the Python SciPy package has many optimization functions; see the sketch below.
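A minimal sketch of option 2, using scipy.optimize.minimize_scalar (one of the SciPy optimization functions mentioned above); dis here is just a placeholder for your real, complicated function, and k is the target value:

import numpy as np
from scipy.optimize import minimize_scalar

def dis(x):
    return 1e6 * x**2   # placeholder for the real, complicated function

k = 10000

# Minimize the distance between dis(x) and k over the range of interest.
res = minimize_scalar(lambda x: abs(dis(x) - k), bounds=(0.01, 4), method='bounded')
print(res.x, dis(res.x))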
Using numpy
closestindice = np.argmin(np.abs(np.array(values)-k))
But it is a bit strange, as it does not use the 'indices' list.
Maybe you could skip the definition of the 'indices' list and get the values directly in a numpy array.
import numpy as np

def dis(i):
    return (i - 1)**2

nprange = np.arange(0.01, 4, 0.01)
npvalues = np.array([dis(x) for x in nprange])

k = .5
closestindice = np.abs(npvalues - k).argmin()
print(closestindice, npvalues[closestindice])
Output:
28 0.5041
By the way, if the 'dis' function is not monotonic on the range, you could have more than one correct answer, on either side of a local extremum.
I am writing code which uses Numba to JIT-compile my Python code.
The function takes two arrays of the same length as input, randomly selects a slicing point, and returns a tuple with two Frankenstein arrays formed from parts of the two input arrays.
Numba, however, does not yet support the numpy.concatenate function (I don't know if it ever will). As I am unwilling to drop NumPy, does anyone know a performant way to concatenate two NumPy arrays without the concatenate function?
import numpy as np

def randomSlice(str1, str2):
    lenstr = len(str1)
    rnd = np.random.randint(1, lenstr)
    return (np.concatenate((str1[:rnd], str2[rnd:])),
            np.concatenate((str2[:rnd], str1[rnd:])))
This might work for you:
import numpy as np
import numba as nb

@nb.jit(nopython=True)
def randomSlice_nb(str1, str2):
    lenstr = len(str1)
    rnd = np.random.randint(1, lenstr)
    out1 = np.empty_like(str1)
    out2 = np.empty_like(str1)
    out1[:rnd] = str1[:rnd]
    out1[rnd:] = str2[rnd:]
    out2[:rnd] = str2[:rnd]
    out2[rnd:] = str1[rnd:]
    return (out1, out2)
On my machine, using Numba 0.27 and timing via the timeit module to make sure I'm not counting the JIT time in the stats (or you could run the function once and then time subsequent calls), the numba version gives a small but non-negligible performance increase on input arrays of ints or floats of various sizes. If the arrays have a dtype of something like |S1, then numba is significantly slower. The Numba team has spent very little time optimizing non-numeric use cases, so this isn't terribly surprising. I'm a little unclear about the exact form of your input arrays str1 and str2, so I can't guarantee that the code will work for your specific use case.
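For reference, a sketch of that timing setup, assuming both randomSlice from the question and randomSlice_nb above are defined, and using integer arrays as a stand-in for the real inputs:

import timeit
import numpy as np

str1 = np.random.randint(0, 100, size=10000)
str2 = np.random.randint(0, 100, size=10000)

randomSlice_nb(str1, str2)   # warm-up call so JIT compilation is not timed
print(timeit.timeit(lambda: randomSlice_nb(str1, str2), number=1000))
print(timeit.timeit(lambda: randomSlice(str1, str2), number=1000))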
I have a big 2D NumPy array, let's say 5M rows and 10 columns. I want to build a few more columns according to some stateful logic implemented using Numba @jitclass. Let's say there are 50 such new columns to create. The idea is to iterate over all the rows of 10 columns in a Numba @jit function, and for each row, apply each of my 50 "filters" to generate one new cell each. So:
Source1..Source10 Derived1..Derived50
[array of 10 inputs] [array of 50 outputs]
... 5 million rows like this ...
The problem is, I can't pass a list or tuple of my "filters" to an @jit(nopython=True) function, because they are not homogeneous:
@numba.jit(nopython=True)
def calc_derived(source, derived, filters):
    for srcidx, src in enumerate(source):
        for filtidx, filt in enumerate(filters):  # doesn't work
            derived[srcidx, filtidx] = filt.transform(src)
The above doesn't work because filters are a bunch of different classes. As far as I can tell, even making them derive from a common base class is not good enough.
I am left with the possibility of swapping the order of the loops and having the loop over the 50 filters outside of the @jit function, but this would mean the entire source dataset would be loaded 50 times instead of once, which is very wasteful.
Do you have a technique to work around the "homogeneous lists only" requirement of Numba?
You originally asked about doing this with a single function that loops over rows, and applies a list of filters to each row. A challenge with this approach is that numba needs to know or be able to infer the input/output types of each function. I'm not aware of a way to satisfy numba's requirement in this situation (which is not to say that none exists). If there were a way to do this, it could be a better solution (and I'd like to know what it is).
An alternative is to move the code that loops over rows into the filters themselves. Because the filters are numba functions, this should maintain speed. The function that applies the filters would no longer use numba; it would simply loop over the list of filters. But because the number of filters is small relative to the size of the data matrix, hopefully this won't impact speed too severely. Because this function no longer uses numba, the 'heterogeneous list' issue would no longer be a problem.
This approach worked when I tested it (nopython mode is fine). In test cases, filters implemented as numba functions were 10-18x faster than filters implemented as class methods (even though classes were implemented as numba jitclasses; not sure what's going on there). To gain a bit of modularity, filters can be constructed as closures, so that similar filters can be defined using different parameters.
For example, here are filters that compute sums of powers. Given a matrix x, the filter operates over the columns of x, giving an output for each row. It returns a vector v, where v[i] = sum(x[i, :] ** power)
import numba
import numpy as np

# filter constructor
def sumpow(power):
    @numba.jit(nopython=True)
    def run_filter(x):
        (nrows, ncols) = x.shape
        result = np.zeros(nrows)
        for i in range(nrows):
            for j in range(ncols):
                result[i] += x[i, j] ** power
        return result
    return run_filter
# define filters
sum1 = sumpow(1) # sum of elements
sum2 = sumpow(2) # sum of elements squared
# apply a single filter
v = sum2(x)
The function to apply multiple filters looks like this. The output of each filter is stacked into a column of the output.
def apply_filters(x, filters):
    result = np.empty((x.shape[0], len(filters)))
    for (i, f) in enumerate(filters):
        result[:, i] = f(x)
    return result
y = apply_filters(x, [sum1, sum2])
Timing results
Data matrix: random entries drawn from standard normal distribution, float64, 5 million rows x 10 columns. All methods tested using the same matrix.
Filters: sum2 filter above, repeated 20x in a list: [sum2, sum2, ...]
Timed using IPython's %timeit function, best of 3 runs
Numerical outputs of all methods agree
Numba function filters (as shown above): 2.25s
Numba jitclass filters: 28.3s
Pure NumPy (using vectorized ops, no loops): 8.64s
I imagine Numba might gain relative to NumPy for more complex filters.
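For comparison, here is a sketch of what a pure-NumPy sum2 filter could look like (vectorized over rows, no explicit loops; not necessarily the exact code timed above, and x stands for the data matrix):

import numpy as np

def sum2_numpy(x):
    # v[i] = sum(x[i, :] ** 2), computed with vectorized operations
    return (x ** 2).sum(axis=1)

def apply_filters_numpy(x, filters):
    # stack each filter's output as one column of the result
    return np.column_stack([f(x) for f in filters])

# e.g. y = apply_filters_numpy(x, [sum2_numpy] * 20)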
To get a homogeneous list, you could construct a list of the transform functions of all filters. In this case, all list elements would have type method.
# filters = list of filters
transforms = [x.transform for x in filters]
Then pass transforms to calc_derived() instead of filters.
Edit:
On my system, it looks like numba will accept this, but only with nopython=False; see the sketch below.
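A minimal sketch of that variant, assuming each filter exposes a transform(row) method returning a scalar; with nopython=False, numba falls back to object mode for the heterogeneous list of methods:

import numba
import numpy as np

@numba.jit(nopython=False)
def calc_derived(source, derived, transforms):
    for srcidx in range(source.shape[0]):
        for tidx, t in enumerate(transforms):
            derived[srcidx, tidx] = t(source[srcidx])

# transforms = [f.transform for f in filters]
# derived = np.empty((source.shape[0], len(transforms)))
# calc_derived(source, derived, transforms)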
I am trying to create a custom filter to run with the generic filter from the SciPy package.
scipy.ndimage.filters.generic_filter
The problem is that I don't know how to get the returned value to be a scalar, as the generic filter requires. I read through the threads linked at the bottom, but I can't find a way to make my function work.
The code is this:
import numpy as np
import scipy.ndimage as sc

def minimum(window):
    list = []
    for i in range(window.shape[0]):
        window[i] -= min(window)
        list.append(window[i])
    return list

test = np.ones((10, 10)) * np.arange(10)
result = sc.generic_filter(test, minimum, size=3)
It gives the error:
cval, origins, extra_arguments, extra_keywords)
TypeError: a float is required
Scipy filter with multi-dimensional (or non-scalar) output
How to apply ndimage.generic_filter()
http://ilovesymposia.com/2014/06/24/a-clever-use-of-scipys-ndimage-generic_filter-for-n-dimensional-image-processing/
If I understand correctly, you want to subtract from each pixel the minimum of its size-3 neighbourhood. It's not good practice to do that with lists, because numpy is built for efficiency (~100 times faster). The simplest way to do that is just:
test-sc.generic_filter(test, np.min, size=3)
The subtraction is then vectorized over the whole array.
You can also do:
test-np.min([np.roll(test,1),np.roll(test,-1),test],axis=0)
This is about 10 times faster, if you accept the artefacts at the border.
Using the example in Scipy filter with multi-dimensional (or non-scalar) output I converted your code to:
def minimum(window, out):
    list = []
    for i in range(window.shape[0]):
        window[i] -= min(window)
        list.append(window[i])
    out.append(list)
    return 0

test = np.ones((10, 10)) * np.arange(10)
result = []
sc.generic_filter(test, minimum, size=3, extra_arguments=(result,))
Now your function minimum writes its result to the parameter out, and the return value is not used anymore. So the final result list contains all the per-window results collected together, not the output of generic_filter.
Edit 1: When generic_filter is used with a function that returns a scalar, a matrix of the same dimensions as the input is returned. Here, however, the list from each call of the filter is appended, which results in a larger structure (100 windows x 9 values in this case).
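If the goal is each pixel's value minus its local minimum, a scalar-returning callback would look something like this sketch (generic_filter passes the neighbourhood as a flattened 1-D array, so with size=3 the centre element sits at index len(window) // 2):

import numpy as np
import scipy.ndimage as sc

def minus_local_min(window):
    # window is the flattened 3x3 neighbourhood; return a single float
    return window[len(window) // 2] - window.min()

test = np.ones((10, 10)) * np.arange(10)
result = sc.generic_filter(test, minus_local_min, size=3)
print(result.shape)   # (10, 10), same shape as the input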
I had a pretty compact way of computing the partition function of an Ising-like model using itertools, lambda functions, and large NumPy arrays. Given a network consisting of N nodes and Q "states"/node, I have two arrays, h-fields and J-couplings, of sizes (N,Q) and (N,N,Q,Q) respectively. J is upper-triangular, however. Using these arrays, I have been computing the partition function Z using the following method:
import itertools
import numpy as np

# Set up lambda functions and iteration tuples of the form (A_1, A_2, ..., A_n)
iters = itertools.product(range(Q), repeat=N)
hf = lambda s: h[range(N), s]
jf = lambda s: np.array([J[fi, fj, s[fi], s[fj]]
                         for fi, fj in itertools.combinations(range(N), 2)]).flatten()

# Initialize and populate partition function array
pf = np.zeros(tuple([Q for i in range(N)]))
for it in iters:
    hterms = np.exp(hf(it)).prod()
    jterms = np.exp(-jf(it)).prod()
    pf[it] = jterms * hterms

# Calculate the partition function
Z = pf.sum()
This method works quickly for small N and Q, say (N,Q) = (5,2). However, for larger systems (N,Q) = (18,3), this method cannot even create the pf array due to memory issues because it has Q^N nontrivial elements. Any ideas on how to either overcome this memory issue or how to alter the code to work on subarrays?
Edit: Made a small mistake in the definition of jf. It has been corrected.
You can avoid the large array just by initializing Z to 0 and incrementing it by hterms * jterms in each iteration. This still won't get you out of calculating and summing Q^N numbers, however. To do that, you probably need to figure out a way to simplify the partition function algebraically.
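A minimal sketch of that accumulation, reusing hf, jf, N and Q from the question (still Q^N iterations, but no Q^N-element array is allocated):

import itertools
import numpy as np

Z = 0.0
for it in itertools.product(range(Q), repeat=N):
    hterms = np.exp(hf(it)).prod()
    jterms = np.exp(-jf(it)).prod()
    Z += hterms * jterms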
I'm not sure what you are trying to compute, but I tested your code with ChrisB's suggestion and jf will not work for Q=3.
Perhaps you shouldn't use a dense numpy array to encode your function? You could try sparse arrays, or just straight Python with Numba compilation. This blog post shows Numba used on the simple Ising model with good performance.