How can these 2 loops be vectorized in Python? - python

I'm retrieving close to 400k values into values, which is pretty slow by itself (that code is not shown), and then I try to predict those values with a Kalman filter. The first loop takes a little over a minute to run, and the second around two and a half minutes. I think the first can be vectorized, but I'm not sure how, especially with window_sma. For the second loop, I'm not sure how to deal with x growing on every iteration (x = np.append(x, new_x_col, axis=1)).
This is the first one, which tries to do a prediction based on the values from SMA, using polyfit and polyval:
window_sma = 200
sma_index = 500
offset = 50
SMA = talib.SMA(values, timeperiod = window_sma)
vector_X = [1, 2, 3, 15]
sma_predicted = []
start_time = time.time()
for i in range (sma_index, len(SMA)):
    j = int(i - offset)
    k = int(i - offset / 2)
    window_sma = [SMA[j], SMA[k], SMA[i]]
    polyfit = np.polyfit([1, 2, 3], window_sma, 2)
    y_hat = np.polyval(polyfit, vector_X)
    sma_predicted.append(y_hat[-1])
And the second one, which attempts to filter the output of the first for loop to get a better prediction of the values I got from SMA:
# Kalman Filter
km = KalmanFilter(dim_x = 2, dim_z = 1)
# state transition matrix
km.F = np.array([[1.,1.],
                 [0.,1.]])
# Measurement function
km.H = np.array([[1.,0.]])
# Change in time
dt = 0.0001
a = 1.5
# Covariance Matrix
km.Q = np.power(a, 2) * \
    np.array([[np.power(dt,4)/4, np.power(dt,3)/2],
              [np.power(dt,3)/2, np.power(dt,2)]])
# Variance
km.R = 1000
# Identity Matrix
I = np.array([[1, 0], [0, 1]])
# Measurement Matrix
km.Z = np.array(sma_predicted)
# Initial state
x = np.zeros((2,1))
x = np.array([[sma_predicted[0]], [0]])
# Initial distribution state's covariance matrix
km.P = np.array([[1000, 0], [0, 1000]])
for i in range (0, len(sma_predicted) - 1):
    # Prediction
    new_x_col = np.dot(km.F, x[:, i]).reshape(2, 1)
    x = np.append(x, new_x_col, axis=1)
    km.P = km.F * km.P * km.F.T + km.Q
    # Correction
    K = np.dot(km.P, km.H.T) / (np.dot(np.dot(km.H, km.P), km.H.T) + km.R)
    x[:, -1] = x[:, -1] + np.dot(K, (km.Z[i + 1] - np.dot(km.H, x[:, -1])))
    #x[:, -1] = (x[:, -1] + K * (km.Z[i + 1] - km.H * x[:, -1])).reshape(2, i + 2)
    km.P = (I - K * km.H) * km.P
Thanks!

The second one is worth attacking first, so I'll just do that.
You have this:
x = np.array([[sma_predicted[0]], [0]])
for i in range (0, len(sma_predicted) - 1):
    new_x_col = np.dot(km.F, x[:, i]).reshape(2, 1)
    x = np.append(x, new_x_col, axis=1)
    # ...
Repeatedly appending to the same array is bad practice in NumPy, because np.append copies the whole array on every call. Preallocate instead, starting with something like this:
x = np.zeros((2, len(sma_predicted)))
x[0, 0] = sma_predicted[0]
for i in range(len(sma_predicted) - 1):
    x[:, i+1] = np.dot(km.F, x[:, i])
    # ...
Note that reshape(2, 1) is no longer needed: np.dot(km.F, x[:, i]) returns a 1-D array of length 2, which already matches the shape of the slice x[:, i+1].
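Carrying the preallocation through the whole filter, a minimal sketch of the loop might look like the following. It assumes the `*` operators in the original prediction and correction steps were meant as matrix products (written with `@` here), and it takes F, H, Q, R and the measurements as plain arrays rather than as attributes of a KalmanFilter object. The function name kalman_1d is made up for illustration, and a constant signal stands in for sma_predicted.

```python
import numpy as np

def kalman_1d(z, F, H, Q, R, P0):
    """Filter scalar measurements z using a preallocated state array (hypothetical helper)."""
    n = len(z)
    x = np.zeros((F.shape[0], n))
    x[0, 0] = z[0]                    # initial state: first measurement, zero rate
    P = P0.copy()
    I = np.eye(F.shape[0])
    for i in range(n - 1):
        # Prediction
        x_pred = F @ x[:, i]
        P = F @ P @ F.T + Q
        # Correction (H P H^T + R is 1x1: one measurement per step)
        S = H @ P @ H.T + R
        K = (P @ H.T) / S             # Kalman gain, shape (2, 1)
        residual = z[i + 1] - H @ x_pred
        x[:, i + 1] = x_pred + K @ residual
        P = (I - K @ H) @ P
    return x

F = np.array([[1., 1.], [0., 1.]])
H = np.array([[1., 0.]])
dt, a = 0.0001, 1.5
Q = a**2 * np.array([[dt**4 / 4, dt**3 / 2], [dt**3 / 2, dt**2]])
z = np.full(50, 5.0)                  # stand-in for sma_predicted
x = kalman_1d(z, F, H, Q, R=1000.0, P0=np.eye(2) * 1000.0)
```

Because every array in the loop has a fixed shape, a function written this way is also the kind of code that Numba tends to handle well.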
I realize this does not answer all of your implicit questions, but perhaps it gets you started.
It would be nice if dot were a ufunc so we could do something like np.dot.outer(km.F, x.T), but it isn't (see this from 2009), so we can't. You could implement more speedups using Numba (with the append() removed as I showed, your code is a good candidate for Numba).
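For the first loop, one observation may help: fitting a degree-2 polynomial through exactly three points at x = 1, 2, 3 is plain interpolation, so its value at x = 15 is a fixed weighted sum of the three samples. The Lagrange weights work out to 78, -168 and 91. The sketch below assumes offset is even (so that int(i - offset / 2) equals i - offset // 2) and uses a random array as a stand-in for the talib output.

```python
import numpy as np

rng = np.random.default_rng(0)
SMA = rng.random(2000)         # stand-in for talib.SMA(values, timeperiod=200)
sma_index, offset = 500, 50    # offset assumed even

# Lagrange weights of the parabola through x = 1, 2, 3, evaluated at x = 15
w_j, w_k, w_i = 78.0, -168.0, 91.0

i = np.arange(sma_index, len(SMA))
sma_predicted = w_j * SMA[i - offset] + w_k * SMA[i - offset // 2] + w_i * SMA[i]

# Reference: the original polyfit/polyval step on the first few indices
check = [
    np.polyval(np.polyfit([1, 2, 3], [SMA[n - offset], SMA[n - offset // 2], SMA[n]], 2), 15)
    for n in range(sma_index, sma_index + 20)
]
```

The result is a NumPy array rather than a list, which also suits the preallocated filter loop.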

Related

Diagonal of a numpy matrix without compute the entire matrix

I have a simple algebraic problem that I would like to solve with numpy (of course I could solve it easily with numba, but that is not the point).
Let us consider a first random matrix A with size (m x n), with n a big value, and a second random matrix B with size (n x n).
A = np.random.random((10**6, 10**2))
B = np.random.random((10**2, 10**2))
We want to compute the following expression:
np.diag(np.dot(np.dot(A,B),B.T))
The problem is that the entire matrix is loaded into memory and only then is the diagonal extracted. Is it possible to do this operation more efficiently?
This is how I would approach it from your starting expression
np.diag(np.dot(np.dot(A,B),B.T))
You can start by grouping terms:
np.diag(np.dot(A, np.dot(B,B.T)))
then only use the first relevant (square) part of A:
np.diag(np.dot(A[:B.shape[0], :], np.dot(B,B.T)))
and then avoid the extra multiplications (that will fall out of the diagonal), by doing the element-wise multiplications yourself:
np.sum( np.multiply(A[:B.shape[0], :].T, np.dot(B,B.T)), 0)
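The same row-by-column sum can also be written with np.einsum, which some readers find easier to check. This is only a sketch, run on smaller random matrices than in the question and compared against the full product:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((1000, 100))    # fewer rows than the question's 1E6, for a quick check
B = rng.random((100, 100))

n = B.shape[0]
M = B @ B.T                    # (n, n)
# diag[i] = sum_j A[i, j] * M[j, i], using only the first n rows of A
diag_fast = np.einsum('ij,ji->i', A[:n], M)

diag_full = np.diag(A @ B @ B.T)   # builds the whole (1000, 100) product first
```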
Changed (A*B)*B.T to A*(B*B.T)
Multiplied only this part of A (A[:B.shape[0]]) that would result in the diagonal part of the matrix
import numpy as np
import time
A = np.random.random((1000_000, 100))
B = np.random.random((100, 100))
start_time = time.time()
result = np.diag(np.dot(np.dot(A, B), B.T))
print('Baseline: ', time.time() - start_time)
start_time = time.time()
for i in range(100):
    result2 = np.diag(np.dot(A[:B.shape[0]], np.dot(B, B.T)))
print('Optimized: ', (time.time() - start_time) / 100)
assert np.allclose(result, result2)
Baseline: 1.7957241535186768
Optimized: 0.00016015291213989258
Yes.
N = 10**6
n = 10**2
A = np.random.random((N, n))
B = np.random.random((n, n))
# np.dot(np.dot(A, B), B.T) has shape (N, n), so its diagonal has n entries.
result = np.empty(n)
for i in range(n):
    # B.T[:, i] is the same vector as B[i, :]
    result[i] = np.dot(np.dot(A[i, :], B), B[i, :])
Explanation:
Say we have K = np.dot(np.dot(A, B), B.T).
Let X = np.dot(A[0, :], B), which is row 0 of np.dot(A, B).
Then np.dot(X, B.T[:, 0]) is the [0, 0] element of K, that is, K[0, 0] = np.dot(np.dot(A[0, :], B), B.T[:, 0]).
This generalizes to K[i, i] = np.dot(np.dot(A[i, :], B), B.T[:, i]).

Regarding Filtering/looping and performing math in the same iteration of NumPy array

I am trying to filter/loop and perform math within the same iteration, but I can't seem to find the right answer.
I have a numpy array of size 6, 2, consisting of pairs of values that I want to subtract from each other; however, I want the values filtered before the subtraction.
So if a value is greater than the one in the other column, the lower value should be subtracted from the higher value, and vice versa.
This also needs to happen in a loop that iterates through the array while performing the filtering and the math.
This is my code example:
#minus price
print('minus price trying appending')
minus_p_orgp1 = np.append(dif_p_times1, fp, axis=0)
print(minus_p_orgp1)
for ii, vv in enumerate(minus_p_orgp1):
    print('greater')
    greater_1 = np.all(ii > 0, axis=0)
    greater_0 = np.all(ii <= 0, axis=0)
    if greater_1 < greater_0:
        iit = greater_0 - greater_1
    if greater_1 > greater_0:
        iit = greater_1 - greater_0
    print(iit, ii, vv)
ssss = np.zeros(minus_p_orgp1.size - 1)
for i in range(len(minus_p_orgp1) - 1):
    if minus_p_orgp1[i] < minus_p_orgp1[i]:
        ssss[i] = minus_p_orgp1[i + 1] - minus_p_orgp1[i]
    elif minus_p_orgp1[i + 1] > minus_p_orgp1[i]:
        ssss[i] = minus_p_orgp1[i] - minus_p_orgp1[i + 1]
print(ssss)
This is a print of the array, where the upper vector is dif_p_times1 and the lower vector is fp:
minus price trying appending
[[79340.33057205 78379.24102508 72188.80527274 76557.26239563
72857.90423589 71137.7943199 ]
[43528.22 43705. 43931.07 44571.24
44330.43 44465.64 ]]
Any suggestions as to what I can do to achieve my goal?
I have also tried keeping the array as two separate vectors of size 6, 1.
But that also seems very difficult; let me know what you think.
I have also just tried this, but it just prints zeros when I run the code:
trii = np.array([[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]])
print(trii)
print(minus_p_orgp1[~(trii >= 1)])
print('it works')
itt = minus_p_orgp1[~(trii >= 1)]
itt1 = minus_p_orgp1[~(trii >= 0)]
sssss = np.zeros(dif_p_times1.size - 1)
ssss = np.zeros(minus_p_orgp1.size - 1)
for i in range(len(dif_p_times1) - 1):
    for ii in range(len(fp) - 1):
        if itt < itt1:
            sssss[i] = itt[i] + itt1[i + 1]
            ssss[i, ii] = fp[ii + 1] - dif_p_times1[i]
        elif itt > itt1:
            sssss[i] = itt[i + 1] + itt1[i]
            ssss[i, ii] = dif_p_times1[i] - fp[ii + 1]
print(sssss)
[[0 0 0 0 0 0]
[1 1 1 1 1 1]]
[63455.70703442 68744.47486851 77804.44752373 79686.34612013
69322.78250338 83255.08459329]
it does something
[0. 0. 0. 0. 0.]
New attempt; however, it still doesn't work:
ssss = np.zeros(minus_p_orgp1.size - 1)
x = minus_p_orgp1[::2]
y = minus_p_orgp1[::-2]
z = ssss[::2]
for z, x, y in range(len(minus_p_orgp1) - 1):
    if x[i + 1] < y[i]:
        z[i] = y[i + 1] - x[i]
    elif x[i + 1] > y[i]:
        z[i] = x[i + 1] - y[i]
print(z)
Best regards.
Mathias.
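If the goal is read as "for each column, subtract the smaller of the two values from the larger", that is an element-wise absolute difference, and no loop or filtering is needed. A minimal sketch under that assumption, using the values from the printout above as two 1-D vectors:

```python
import numpy as np

dif_p_times1 = np.array([79340.33057205, 78379.24102508, 72188.80527274,
                         76557.26239563, 72857.90423589, 71137.7943199])
fp = np.array([43528.22, 43705.0, 43931.07, 44571.24, 44330.43, 44465.64])

# Larger minus smaller, column by column
diff = np.abs(dif_p_times1 - fp)

# Equivalent form that makes the "filter" explicit
diff_explicit = np.maximum(dif_p_times1, fp) - np.minimum(dif_p_times1, fp)
```

For the stacked 2-row array, the same idea would be np.abs(minus_p_orgp1[0] - minus_p_orgp1[1]).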

Struggling to implement the bezier quadratics in my code and was wondering if someone could take a look over it?

So here is what I have tried:
def bezier(a):
    # find order of curve from number of control points
    n = np.shape(a)[0]-1
    # initialise arrays
    B = np.zeros([101, 2])
    terms = np.zeros([n+1, 2])
    # create an array of values for t from 0 to 1 in 101 steps
    t = np.linspace(0, 1, 101)
    # loop through all t values
    for i in range(0, 101):
        #calculate terms inside sum in equation 13
        for j in range(0, n + 1):
            terms[j,:] = (1 - t[i])**2 * a[0,:] + 2 * t[i] * (1 - t[i]) * a[1,:] + t[i]**2 * a[2,:]
        #sum terms to find Bezier curve
        B[i, :] = sum(terms, 0)
    # plot Bezier
    pl.plot(B[:, 0], B[:, 1])
    # plot control points
    pl.plot(a[:, 0], a[:, 1],'ko')
    # plot control polygon
    pl.plot(a[:, 0], a[:, 1],'k')
    return B
with:
a = np.array([[0, 0], [0.5, 1], [1, 0]])
B = bezier(a)
and it is returning:
this graph
which as you can see does not correspond to my control points
Any help appreciated, thanks.
The sum over j is redundant. Each pass of the inner loop already stores the complete quadratic Bezier expression in terms[j, :], so summing the three rows gives a curve that is three times as big as it should be.
import numpy as np
import matplotlib.pyplot as pl
def bezier(a):
    # find order of curve from number of control points
    n = np.shape(a)[0]-1
    # initialise arrays
    B = np.zeros([101, 2])
    terms = np.zeros([n+1, 2])
    # create an array of values for t from 0 to 1 in 101 steps
    t = np.linspace(0, 1, 101)
    # loop through all t values
    for i in range(0, 101):
        #calculate terms inside sum in equation 13
        B[i, :] = (1 - t[i])**2 * a[0,:] + 2 * t[i] * (1 - t[i]) * a[1,:] + t[i]**2 * a[2,:]
    # plot Bezier
    pl.plot(B[:, 0], B[:, 1])
    # plot control points
    pl.plot(a[:, 0], a[:, 1],'ko')
    # plot control polygon
    pl.plot(a[:, 0], a[:, 1],'k')
    return B
a = np.array([[0, 0], [0.5, 1], [1, 0]])
B = bezier(a)
pl.show()
I would also recommend renaming a to something more descriptive such as controlPts.
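The remaining loop over t can also be dropped by letting NumPy broadcast t against the control points. This is a sketch without the plotting calls; the name bezier_quadratic is illustrative, and it assumes exactly three control points.

```python
import numpy as np

def bezier_quadratic(control_pts, num=101):
    """Evaluate a quadratic Bezier curve at `num` values of t in [0, 1]."""
    p0, p1, p2 = control_pts               # three control points, each of shape (2,)
    t = np.linspace(0, 1, num)[:, None]    # column vector, broadcasts over x and y
    return (1 - t)**2 * p0 + 2 * t * (1 - t) * p1 + t**2 * p2

a = np.array([[0, 0], [0.5, 1], [1, 0]])
B = bezier_quadratic(a)
```

The returned array has one row per value of t, so B[:, 0] and B[:, 1] can be passed to pl.plot exactly as before.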

Projection of matrix onto a simplex

I have a problem understanding this piece of code. Based on the output, I guess it computes the eigenvectors of the matrix.
def simplexProj(y):
    """
    Given y, computes its projection x* onto the simplex
        Delta = { x | x >= 0 and sum(x) <= 1 },
    that is, x* = argmin_x ||x-y||_2 such that x in Delta.
    x = SimplexProj(y)
    ****** Input ******
    y : input vector.
    ****** Output ******
    x : projection of y onto Delta.
    """
    if len(y.shape) == 1: # Reshape to (1,-1) if y is a vector.
        y = y.reshape(1, -1) # row vector
    x = y.copy()
    x[x < 0] = 0 #element within the matrix that is negative will be replaced with 0, python2 feature
    K = np.flatnonzero(np.sum(x, 0) > 1) #return indices that are non-zero in the flattened version of a ; sum of each column
    # K gives the column index for column that has colum sum>1, True = 1, False = 0
    x[:, K] = blockSimplexProj(y[:, K])
    return x
def blockSimplexProj(y):
    """ Same as function SimplexProj except that sum(max(Y,0)) > 1. """
    r, c = y.shape
    ys = -np.sort(-y, axis=0) #sort each column of the matrix with biggest entry on the first row
    mu = np.zeros(c, dtype=float)
    S = np.zeros((r, c), dtype=float)
    for i in range(1, r): #1st to r-1th row
        S[i, :] = np.sum(ys[:i, :] - ys[i, :], 0)
        print(S)
        colInd_ge1 = np.flatnonzero(S[i, :] >= 1)
        colInd_lt1 = np.flatnonzero(S[i, :] < 1)
        if len(colInd_ge1) > 0:
            mu[colInd_ge1] = (1 - S[i - 1, colInd_ge1]) / i - ys[i - 1, colInd_ge1]
        if i == r:
            mu[colInd_lt1] = (1 - S[r, colInd_lt1]) / (r + 1) - ys[r, colInd_lt1]
    x = y + mu
    x[x < 0] = 0
    return x
I'm a bit puzzled by the step that computes the matrix S, because according to the code the first row of S should be all 0. Take for example the matrix A = np.array([[25,70,39,10,80],[12,45,32,89,43],[67,24,84,39,21],[0.1,0.2,0.3,0.035,0.06]]). The 3 iterations (i=1,2,3) are computed as expected, but then there is an extra step which seemingly gives back S as a basis of eigenvectors. It would be great if somebody could help me understand this. Also, I'm not sure what the name of this algorithm is (how S is computed).
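For comparison, here is a sketch of the widely used sort-and-threshold projection for a single vector. It is not the posted code, and it is only an assumption that blockSimplexProj is meant to compute the same thing column by column; the function name project_simplex is made up. It may serve as a reference when checking what x = y + mu should come out to for one column.

```python
import numpy as np

def project_simplex(y):
    """Project a 1-D vector onto { x | x >= 0, sum(x) <= 1 } (illustrative sketch)."""
    x = np.maximum(y, 0.0)
    if x.sum() <= 1.0:
        return x                           # clipping alone already lands in the set
    u = np.sort(y)[::-1]                   # entries in decreasing order
    css = np.cumsum(u)
    j = np.arange(1, len(y) + 1)
    # Index of the last sorted entry that stays positive after the shift
    rho = np.nonzero(u - (css - 1.0) / j > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(y - theta, 0.0)

x = project_simplex(np.array([0.5, 0.8, -0.2]))
```

In this formulation the shift theta plays the role of -mu in the posted code, and no eigenvectors are involved at any point.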

Improve performance of function without parallelization

Some weeks ago I posted a question (Speed up nested for loop with elements exponentiation) which got a very good answer by abarnert. This question is related to that one since it makes use of the performance improvements suggested by said user.
I need to improve the performance of a function that involves calculating three factors and then applying an exponential on them.
Here's a MWE of my code:
import numpy as np
import timeit
def random_data(N):
    # Generate some random data.
    return np.random.uniform(0., 10., N)
# Data lists.
array1 = np.array([random_data(4) for _ in range(1000)])
array2 = np.array([random_data(3) for _ in range(2000)])
# Function.
def func():
    # Empty list that holds all values obtained in for loop.
    lst = []
    for elem in array1:
        # Avoid numeric errors if one of these values is 0.
        e_1, e_2 = max(elem[0], 1e-10), max(elem[1], 1e-10)
        # Obtain three parameters.
        A = 1./(e_1*e_2)
        B = -0.5*((elem[2]-array2[:,0])/e_1)**2
        C = -0.5*((elem[3]-array2[:,1])/e_2)**2
        # Apply exponential.
        value = A*np.exp(B+C)
        # Store value in list.
        lst.append(value)
    return lst
# time function.
func_time = timeit.timeit(func, number=100)
print(func_time)
Is it possible to speed up func without having to resort to parallelization?
Here's what I have so far. My approach is to do as much of the math as possible across numpy arrays.
Optimizations:
Calculate As within numpy
Re-factor calculation of B and C by splitting them into factors, some of which can be computed within numpy
Code:
def optfunc():
    e0 = array1[:, 0]
    e1 = array1[:, 1]
    e2 = array1[:, 2]
    e3 = array1[:, 3]
    ar0 = array2[:, 0]
    ar1 = array2[:, 1]
    As = 1./(e0 * e1)
    Bfactors = -0.5 * (1 / e0**2)
    Cfactors = -0.5 * (1 / e1**2)
    lst = []
    for i, elem in enumerate(array1):
        B = ((elem[2] - ar0) ** 2) * Bfactors[i]
        C = ((elem[3] - ar1) ** 2) * Cfactors[i]
        value = As[i]*np.exp(B+C)
        lst.append(value)
    return lst
print(np.allclose(optfunc(), func()))
# time function.
func_time = timeit.timeit(func, number=10)
opt_func_time = timeit.timeit(optfunc, number=10)
print("%.3fs --> %.3fs" % (func_time, opt_func_time))
Result:
True
0.759s --> 0.485s
At this point I'm stuck. I managed to do it entirely without python for loops, but it is slower than the above version for a reason I do not yet understand:
def optfunc():
    x = array1
    y = array2
    x0 = x[:, 0]
    x1 = x[:, 1]
    x2 = x[:, 2]
    x3 = x[:, 3]
    y0 = y[:, 0]
    y1 = y[:, 1]
    A = 1./(x0 * x1)
    Bfactors = -0.5 * (1 / x0**2)
    Cfactors = -0.5 * (1 / x1**2)
    B = (np.transpose([x2]) - y0)**2 * np.transpose([Bfactors])
    C = (np.transpose([x3]) - y1)**2 * np.transpose([Cfactors])
    return np.transpose([A]) * np.exp(B + C)
Result:
True
0.780s --> 0.558s
However note that the latter gets you an np.array whereas the former only gets you a Python list... this might account for the difference but I'm not sure.
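One thing to watch in both rewrites is that they drop the max(..., 1e-10) guard from the original func. Below is a sketch of a broadcast version that keeps the guard by using np.maximum. The function name and the explicit arguments are illustrative, the array sizes are reduced, and no timing is claimed for it.

```python
import numpy as np

def func_broadcast(array1, array2):
    # Clamp the two scale columns away from zero, as the original loop does
    e1 = np.maximum(array1[:, 0], 1e-10)[:, None]
    e2 = np.maximum(array1[:, 1], 1e-10)[:, None]
    A = 1.0 / (e1 * e2)
    # (N1, 1) minus (N2,) broadcasts to (N1, N2)
    B = -0.5 * ((array1[:, 2][:, None] - array2[:, 0]) / e1) ** 2
    C = -0.5 * ((array1[:, 3][:, None] - array2[:, 1]) / e2) ** 2
    return A * np.exp(B + C)

rng = np.random.default_rng(0)
array1 = rng.uniform(0.0, 10.0, (100, 4))
array2 = rng.uniform(0.0, 10.0, (200, 3))
out = func_broadcast(array1, array2)
```

Row i of the result corresponds to the i-th entry of the list returned by the original func.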
