I am trying to fill an array with values calculated by functions defined earlier in my code. I started with code that has a structure similar to the following:
from numpy import cos, sin, arange, zeros
a = arange(1000)
b = arange(1000)
def defcos(x):
    return cos(x)

def defsin(x):
    return sin(x)
a_len = len(a)
b_len = len(b)
result = zeros((a_len,b_len))
for i in range(b_len):
    for j in range(a_len):
        a_res = defcos(a[j])
        b_res = defsin(b[i])
        result[i, j] = a_res * b_res
I then tried to use array versions of the functions, which changed the loop to the following:
a_res = defsin(a)
b_res = defcos(b)
for i in range(b_len):
    for j in range(a_len):
        result[i, j] = a_res[i] * b_res[j]
This is already significantly faster than the first version. But is there a way to avoid the loop entirely? I have encountered loops like these a couple of times in the past but never bothered, as speed was not critical. This time, however, it is the core component of something that is itself looped over several more times. :)
Any help would be appreciated, thanks in advance!
Like so:
from numpy import newaxis
a_res = sin(a)
b_res = cos(b)
result = a_res[:, newaxis] * b_res
To understand how this works, have a look at the rules for array broadcasting. And please don't define useless wrappers like defsin; just use sin itself! Another minor detail: you take i from range(b_len) but use it to index a_res, which is a bug whenever a_len != b_len.
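To see the broadcasting at work, here is a small self-contained comparison between the double loop and the broadcast product (my own illustration, using smaller arrays than the question's):

```python
import numpy as np

a = np.arange(200)
b = np.arange(300)

# Explicit double loop, as in the question.
loop = np.zeros((len(b), len(a)))
for i in range(len(b)):
    for j in range(len(a)):
        loop[i, j] = np.sin(b[i]) * np.cos(a[j])

# Broadcasting: a (300, 1) column times a (200,) row yields a (300, 200) grid.
vec = np.sin(b)[:, np.newaxis] * np.cos(a)

print(np.allclose(loop, vec))  # True
```

For 1-D inputs, np.outer(np.sin(b), np.cos(a)) is an equivalent spelling of the same product.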
I wonder how I could implement an iterative root finder using fsolve over an interval, stopping once it has found N roots
(assuming I know how small the steps should be to catch every root during the procedure).
Is there a way to do so with a simple double for loop?
Here is what it would look like for a "simple" function (cos(x)*x):
import numpy as np
from scipy.optimize import fsolve

def f(x):
    return np.cos(x)*x

for n in range(1, 10):
    a = 0
    k = 0
    while k < 1000:
        k = fsolve(f, a)
        if k == a:
            a = a + 0.01
            k = fsolve(f, a)
        else:
            print(k)
But I can't make it work this way. I can't use chebpy because my real function is more complex (it involves a Bessel function) and chebpy doesn't seem to accept such a function as an argument.
Edit: Corrected the indentation; this program yields 0 (the first solution) an infinite number of times without stopping.
Can you share your error?
It may be something related to the indentation of your f(x) function; try changing your code to this:
def f(x):
    return np.cos(x)*x
I found a solution that does the job for now. It consists of passing an array corresponding to the search interval to fsolve and then filtering the solutions (in my case, keeping only positive solutions, removing duplicates, etc.).
It might not be the best way, but it works for me.
In this example I'm looking for the first 10 positive solutions of cos(x)*x = 0,
assuming they lie in [0, 100].
import numpy as np
from scipy.optimize import fsolve

def f(x):
    return np.cos(x)*x

# Named interval of starting guesses (avoids shadowing the built-in int).
guesses = np.arange(0, 100, 1)
roots = fsolve(f, guesses)
# print(roots)
roots = np.around(roots, decimals=5, out=None)
a = roots[roots >= 0]
b = np.unique(a)
b = b[:10]
print(b)
Result :
[ 0. 1.5708 4.71239 7.85398 10.99557 14.13717 17.27876 20.42035
23.56194 26.70354]
I had to use np.around, otherwise np.unique would not work.
Thanks again.
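An alternative not used in the thread, sketched here under the same assumption that the scan step is fine enough to separate neighbouring roots: walk the interval, detect sign changes of f, and polish each bracket with scipy.optimize.brentq (the helper name first_n_roots is mine):

```python
import numpy as np
from scipy.optimize import brentq

def f(x):
    return np.cos(x) * x

def first_n_roots(func, a, b, n, step=0.5):
    # Scan [a, b] in small steps; wherever func changes sign,
    # the root is bracketed and brentq refines it.
    roots = []
    xs = np.arange(a, b, step)
    for lo, hi in zip(xs[:-1], xs[1:]):
        if func(lo) == 0.0:
            roots.append(float(lo))
        elif func(lo) * func(hi) < 0:
            roots.append(brentq(func, lo, hi))
        if len(roots) >= n:
            break
    return roots[:n]

print(first_n_roots(f, 0, 100, 10))
# first entries approximately 0.0, 1.5708, 4.7124, ... -- the same roots as above
```

Unlike the fsolve-over-an-array approach, this never needs rounding or deduplication, because each bracket contains exactly one root.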
The code below runs very slowly. I tried using numpy.argwhere instead of an if statement to speed it up, which helped considerably, but it is still very slow. I also tried numpy.frompyfunc and numpy.vectorize, but failed. What would you suggest to speed up the code below?
import numpy as np
import time
time1 = time.time()
n = 1000000
k = 10000
velos = np.linspace(-1000, 1000, n)
line_centers = np.linspace(-1000, 1000, k)
weights = np.random.random_sample(k)
rvs = np.arange(-60, 60, 2)
m = len(rvs)
w = np.arange(10)
M = np.zeros((n, m))
for l, lc in enumerate(line_centers):
    vi = velos - lc
    for j in range(m - 1):
        w = np.argwhere((vi < rvs[j + 1]) & (vi > rvs[j])).T[0]
        M[w, j] = weights[l] * (rvs[j + 1] - vi[w]) / (rvs[j + 1] - rvs[j])
        M[w, j + 1] = weights[l] * (vi[w] - rvs[j]) / (rvs[j + 1] - rvs[j])
time2 = time.time()
print(time2 - time1)
EDIT:
The size of the array M was incorrect. I fixed it.
This seems like a situation where a C++ interface could come in handy. With Pybind11 you can write C++ functions that take NumPy arrays as arguments, manipulate them, and return them to Python. That would speed up your loops. Take a look at it!
Of course it is slow: you have two nested loops! You need to rethink your algorithm in terms of vector operations, that is, no iteration over indices, but index or boolean arrays and index shifts.
You have not given any background information, so it is hard for anyone to suggest something meaningful (given the soup of indices in the example). A few quick suggestions from glancing over your example:
An expression like this (rvs[j + 1] - rvs[j]) is easily replaced with numpy.ediff1d.
You seem to be iterating through n in blocks of m, maybe numpy.nditer will be of use.
I have a hunch that your inner loop has an error, are you sure you really mean to iterate over range(m - 1)? That would mean you are iterating from 0 to m-2 (inclusive), I doubt you meant that.
We can help with more concrete answers if you provide more background information.
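As a concrete version of the "index arrays instead of iteration" advice, here is a sketch (mine, not from the thread) that removes the inner loop over bins with a single np.searchsorted call. The helper name fill_rows is made up, and boundary handling differs from the original at exact bin edges:

```python
import numpy as np

def fill_rows(M, vi, weight, rvs):
    # One searchsorted call replaces the loop over j:
    # j[k] is the bin index with rvs[j] < vi[k] <= rvs[j + 1].
    dj = np.diff(rvs)                       # bin widths rvs[j + 1] - rvs[j]
    j = np.searchsorted(rvs, vi) - 1        # bin index of every velocity at once
    ok = (j >= 0) & (j < len(rvs) - 1)      # keep velocities inside the grid
    idx = np.nonzero(ok)[0]
    jj = j[ok]
    frac = (vi[ok] - rvs[jj]) / dj[jj]      # fractional position inside the bin
    M[idx, jj] = weight * (1.0 - frac)      # same as (rvs[j+1] - vi) / width
    M[idx, jj + 1] = weight * frac          # same as (vi - rvs[j]) / width
```

The outer loop over line centers remains; each iteration would call fill_rows(M, velos - lc, weights[l], rvs).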
Sorry for asking a probably very basic question.
Suppose I have defined a function f whose domain is p-dimensional vectors, and I have a list A of p-dimensional vectors. How can I vectorize my computation to get f(A[0]), f(A[1]), ..., f(A[len(A)-1])?
For example:
import numpy as np
def f(x):
    return sum([x[i]*np.sin(x[i]) for i in range(len(x))])
A = [[i, i+1, i+2] for i in range(1000)]
X = [f(A[i]) for i in range(len(A))]
How can I vectorize the computation above so that I get X faster?
I am not sure whether you wanted to vectorize the generation of the list A as well. If your concern is the function f(x), you might want to use NumPy's element-wise multiplication. Below is an example with benchmarking.
import timeit
import numpy as np

def f(x):
    return sum([x[i]*np.sin(x[i]) for i in range(len(x))])

def f2(X):
    # Element-wise multiply, then sum, so the result matches f.
    return np.sum(np.multiply(X, np.sin(X)))

start = timeit.default_timer()
A = [[i, i+1, i+2] for i in range(10000)]
X = [f(A[i]) for i in range(len(A))]
stop = timeit.default_timer()
print(stop - start)

start = timeit.default_timer()
A = [[i, i+1, i+2] for i in range(10000)]
X = [f2(A[i]) for i in range(len(A))]
stop = timeit.default_timer()
print(stop - start)
The two printed timings let you compare f and f2 directly on your machine.
To vectorize this fully you have to adapt the function f so that it works on the whole array A. In your example you want the sum of x*sin(x) across each row of the array.
Multiplication and the sin function work element-wise, so they don't need to be changed. But you have to tell np.sum to sum across rows, which is done with axis=-1:
def f_vec(x):
    return np.sum(x*np.sin(x), axis=-1)
You can pass the whole array A to this function and get X as before:
In [39]: X_vec = f_vec(A)
In [40]: np.all(X_vec == X)
Out[40]: True
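For completeness, here is the whole pipeline kept in NumPy, including building A as an array rather than a list of lists (my own variation on the answer above):

```python
import numpy as np

def f_vec(x):
    # Sum of x*sin(x) across the last axis: one value per row.
    return np.sum(x * np.sin(x), axis=-1)

# Build A directly as a (1000, 3) array: row i holds [i, i+1, i+2].
i = np.arange(1000)
A = np.stack([i, i + 1, i + 2], axis=1)

X = f_vec(A)
print(X.shape)  # (1000,)
```

Keeping A as an array from the start avoids the list-to-array conversion on every call.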
I am new to vectorizing code, and I am really psyched about how much faster everything is, but I can't get the high speed out of this particular piece of code...
Here is the housing class...
class GaussianMixtureModel:
    def __init__(self, image_matrix, num_components, means=None):
        self.image_matrix = image_matrix
        self.num_components = num_components
        if means is None:
            self.means = np.zeros(num_components)
        else:
            self.means = np.array(means)
        self.variances = np.zeros(num_components)
        self.mixing_coefficients = np.zeros(num_components)
And here is what I've got so far that works:
def likelihood(self):
    def g2(x):
        # N =~ 5
        # self.mixing_coefficients = 1D, N items
        # self.variances = 1D, N items
        # self.means = 1D, N items
        mc = self.mixing_coefficients[:, None, None]
        std = self.variances[:, None, None] ** 0.5
        var = self.variances[:, None, None]
        mean = self.means[:, None, None]
        return np.log((mc*(1.0/(std*np.sqrt(2.0*np.pi)))*(np.exp(-((x-mean)**2.0)/(2.0*var)))).sum())
    f = np.vectorize(g2)
    # self.image_matrix =~ 400*700 2D matrix
    log_likelihood = (f(self.image_matrix)).sum()
    return log_likelihood
And here is what I've got that gives a strange result (note that self.image_matrix is a 2D array of a grayscale image):
def likelihood(self):
    def g2():
        # N =~ 5
        # self.mixing_coefficients = 1D, N items
        # self.variances = 1D, N items
        # self.means = 1D, N items
        # self.image_matrix = 400x700 2D matrix
        mc = self.mixing_coefficients[:, None, None]
        std = self.variances[:, None, None] ** 0.5
        var = self.variances[:, None, None]
        mean = self.means[:, None, None]
        return np.log((mc*(1.0/(std*np.sqrt(2.0*np.pi)))*(np.exp(-((self.image_matrix[0,0]-mean)**2.0)/(2.0*var)))).sum())
    log_likelihood = (g2()).sum()
    return log_likelihood
However, the second version is really fast compared to the first (which takes almost 10 seconds...and speed is really important here, because this is part of a convergence algorithm)
Is there a way to replicate the results of the first version and the speed of the second? (And I'm really not familiar enough with vectorizing to know why the second version isn't working)
The second version is so fast because it only uses the first cell of self.image_matrix:
return np.log((mc*(1.0/(std*np.sqrt(2.0*np.pi)))*(np.exp(-((self.image_matrix[0,0]-mean)**2.0)/(2.0*var)))).sum())
# ^^^^^
This is also why it's completely wrong. It's not actually a vectorized computation over self.image_matrix at all. Don't try to use its runtime as a point of comparison; you can always make wrong code faster than right code.
By eliminating the use of np.vectorize, you can make the first version much faster, but not as fast as the wrong code. The sum inside the log simply needs the appropriate axis specified:
def likelihood(self):
    def f(x):
        mc = self.mixing_coefficients[:, None, None]
        std = self.variances[:, None, None] ** 0.5
        var = self.variances[:, None, None]
        mean = self.means[:, None, None]
        return np.log((mc*(1.0/(std*np.sqrt(2.0*np.pi)))*(np.exp(-((x-mean)**2.0)/(2.0*var)))).sum(axis=0))
    log_likelihood = (f(self.image_matrix)).sum()
    return log_likelihood
This can be further simplified and optimized in a few ways. For example, the nested function can be eliminated, and multiplying by 1.0/whatever is slower than dividing by whatever, but eliminating np.vectorize is the big thing.
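The same computation also works outside the class; here is a self-contained sketch (parameter names are mine) that checks the broadcast version against a per-pixel loop:

```python
import numpy as np

def log_likelihood(image, mixing, means, variances):
    # Broadcast the N components over all pixels: (N, 1, 1) against (H, W).
    mc = mixing[:, None, None]
    mean = means[:, None, None]
    var = variances[:, None, None]
    comp = mc * np.exp(-(image - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    # Sum over components (axis 0), take the log per pixel, then sum over pixels.
    return np.log(comp.sum(axis=0)).sum()

rng = np.random.default_rng(0)
image = rng.random((40, 70))
mixing = np.array([0.5, 0.3, 0.2])
means = np.array([0.2, 0.5, 0.8])
variances = np.array([0.05, 0.05, 0.05])
print(log_likelihood(image, mixing, means, variances))
```

The single sum over axis 0 (components) followed by the pixel sum is exactly the "sum of logs", not the "log of sums", which is where the wrong version went astray.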
I have this code:
for j in range(j_start, self.max_j):
    for i in range(0, self.max_i):
        new_i = round(i + ((j - j_start) * discriminant))
        if new_i >= self.max_i:
            continue
        self.grid[new_i, j] = standard[i]
and I want to speed it up by getting rid of the slow native Python loops. NumPy vector operations could be used instead, since they are really fast. How can I do that?
j_start, self.max_j, self.max_i, discriminant: int, int, int, float (constants).
self.grid: two-dimensional numpy array (self.max_i x self.max_j).
standard: one-dimensional numpy array (self.max_i).
Here is a complete solution, perhaps that will help.
jrange = np.arange(self.max_j - j_start)
joffset = np.round(jrange * discriminant).astype(int)
i = np.arange(self.max_i)
for j in jrange:
    new_i = i + joffset[j]
    in_range = new_i < self.max_i
    self.grid[new_i[in_range], j + j_start] = standard[i[in_range]]
It may be possible to vectorize both loops but that will, I think, be tricky.
I haven't tested this but I believe it computes the same result as your code.
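Both loops can in fact be vectorized with 2-D index arrays. Here is a sketch of that idea (my own, with a made-up fill_grid helper); note that where several i values round to the same new_i, the value that survives may differ from the loop's:

```python
import numpy as np

def fill_grid(grid, standard, j_start, discriminant):
    max_i, max_j = grid.shape
    j = np.arange(j_start, max_j)
    i = np.arange(max_i)
    # new_i[a, b] = round(i[a] + (j[b] - j_start) * discriminant), for all i, j at once.
    new_i = np.round(i[:, None] + (j[None, :] - j_start) * discriminant).astype(int)
    ii, jj = np.nonzero(new_i < max_i)      # keep only in-range targets
    grid[new_i[ii, jj], j[jj]] = standard[ii]
    return grid
```

The boolean mask new_i < max_i plays the role of the `continue`, and np.nonzero turns it into the index pairs for one fancy-indexed assignment.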