I am trying to make my own CFD solver and one of the most computationally expensive parts is solving for the pressure term. One way to solve Poisson differential equations faster is by using a multigrid method. The basic recursive algorithm for this is:
function phi = V_Cycle(phi,f,h)
% Recursive V-Cycle Multigrid for solving the Poisson equation (\nabla^2 phi = f) on a uniform grid of spacing h
% Pre-Smoothing
phi = smoothing(phi,f,h);
% Compute Residual Errors
r = residual(phi,f,h);
% Restriction
rhs = restriction(r);
eps = zeros(size(rhs));
% stop recursion at smallest grid size, otherwise continue recursion
if smallest_grid_size_is_achieved
eps = smoothing(eps,rhs,2*h);
else
eps = V_Cycle(eps,rhs,2*h);
end
% Prolongation and Correction
phi = phi + prolongation(eps);
% Post-Smoothing
phi = smoothing(phi,f,h);
end
I've attempted to implement this algorithm myself (also at the end of this question) however it is very slow and doesn't give good results so evidently it is doing something wrong. I've been trying to find why for too long and I think it's just worthwhile seeing if anyone can help me.
If I use a grid size of 2^5 by 2^5 points, then it can solve it and give reasonable results. However, as soon as I go above this it takes exponentially longer to solve and basically get stuck at some level of inaccuracy, no matter how many V-Loops are performed. at 2^7 by 2^7 points, the code takes way too long to be useful.
I think my main issue is that my implementation of a jacobian iteration is using linear algebra to calculate the update at each step. This should, in general, be fast however, the update matrix A is an n*m sized matrix, and calculating the dot product of a 2^7 * 2^7 sized matrix is expensive. As most of the cells are just zeros, should I calculate the result using a different method?
if anyone has any experience in multigrid methods, I would appreciate any advice!
Thanks
my code:
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 29 16:24:16 2020
#author: mclea
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import convolve2d
from mpl_toolkits.mplot3d import Axes3D
from scipy.interpolate import griddata
from matplotlib import cm
def restrict(A):
"""
Creates a new grid of points which is half the size of the original
grid in each dimension.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)/2+2)
new_m = int((m-2)/2+2)
new_array = np.zeros((new_n, new_m))
for i in range(1, new_n-1):
for j in range(1, new_m-1):
ii = int((i-1)*2)+1
jj = int((j-1)*2)+1
# print(i, j, ii, jj)
new_array[i,j] = np.average(A[ii:ii+2, jj:jj+2])
new_array = set_BC(new_array)
return new_array
def interpolate_array(A):
"""
Creates a grid of points which is double the size of the original
grid in each dimension. Uses linear interpolation between grid points.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)*2 + 2)
new_m = int((m-2)*2 + 2)
new_array = np.zeros((new_n, new_m))
i = (np.indices(A.shape)[0]/(A.shape[0]-1)).flatten()
j = (np.indices(A.shape)[1]/(A.shape[1]-1)).flatten()
A = A.flatten()
new_i = np.linspace(0, 1, new_n)
new_j = np.linspace(0, 1, new_m)
new_ii, new_jj = np.meshgrid(new_i, new_j)
new_array = griddata((i, j), A, (new_jj, new_ii), method="linear")
return new_array
def adjacency_matrix(rows, cols):
"""
Creates the adjacency matrix for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 1
return M
def create_differences_matrix(rows, cols):
"""
Creates the central differences matrix A for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = -1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = -1
np.fill_diagonal(M, 4)
return M
def set_BC(A):
"""
Sets the boundary conditions of the field
"""
A[:, 0] = A[:, 1]
A[:, -1] = A[:, -2]
A[0, :] = A[1, :]
A[-1, :] = A[-2, :]
return A
def create_A(n,m):
"""
Creates all the components required for the jacobian update function
for an n by m shaped grid
"""
LaddU = adjacency_matrix(n,m)
A = create_differences_matrix(n,m)
invD = np.zeros((n*m, n*m))
np.fill_diagonal(invD, 1/4)
return A, LaddU, invD
def calc_RJ(rows, cols):
"""
Calculates the jacobian update matrix Rj for an n by m shaped grid
"""
n = int(rows*cols)
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 0.25
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 0.25
return M
def jacobi_update(v, f, nsteps=1, max_err=1e-3):
"""
Uses a jacobian update matrix to solve nabla(v) = f
"""
f_inner = f[1:-1, 1:-1].flatten()
n = v.shape[0]
m = v.shape[1]
A, LaddU, invD = create_A(n-2, m-2)
Rj = calc_RJ(n-2,m-2)
update=True
step = 0
while update:
v_old = v.copy()
step += 1
vt = v_old[1:-1, 1:-1].flatten()
vt = np.dot(Rj, vt) + np.dot(invD, f_inner)
v[1:-1, 1:-1] = vt.reshape((n-2),(m-2))
err = v - v_old
if step == nsteps or np.abs(err).max()<max_err:
update=False
return v, (step, np.abs(err).max())
def MGV(f, v):
"""
Solves for nabla(v) = f using a multigrid method
"""
# global A, r
n = v.shape[0]
m = v.shape[1]
# If on the smallest grid size, compute the exact solution
if n <= 6 or m <=6:
v, info = jacobi_update(v, f, nsteps=1000)
return v
else:
# smoothing
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-1)
A = create_A(n, m)[0]
# calculate residual
r = np.dot(A, v.flatten()) - f.flatten()
r = r.reshape(n,m)
# downsample resitdual error
r = restrict(r)
zero_array = np.zeros(r.shape)
# interploate the correction computed on a corser grid
d = interpolate_array(MGV(r, zero_array))
# Add prolongated corser grid solution onto the finer grid
v = v - d
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-6)
return v
sigma = 0
# Setting up the grid
k = 6
n = 2**k+2
m = 2**(k)+2
hx = 1/n
hy = 1/m
L = 1
H = 1
x = np.linspace(0, L, n)
y = np.linspace(0, H, m)
XX, YY = np.meshgrid(x, y)
# Setting up the initial conditions
f = np.ones((n,m))
v = np.zeros((n,m))
# How many V cyles to perform
err = 1
n_cycles = 10
loop = True
cycle = 0
# Perform V cycles until converged or reached the maximum
# number of cycles
while loop:
cycle += 1
v_new = MGV(f, v)
if np.abs(v - v_new).max() < err:
loop = False
if cycle == n_cycles:
loop = False
v = v_new
print("Number of cycles " + str(cycle))
plt.contourf(v)
I realize that I'm not answering your question directly, but I do note that you have quite a few loops that will contribute some overhead cost. When optimizing code, I have found the following thread useful - particularly the line profiler thread. This way you can focus in on "high time cost" lines and then start to ask more specific questions regarding opportunities to optimize.
How do I get time of a Python program's execution?
Related
I have coded Isomap function starting with computing the eulidean distance matrix (using scipy.spatial.distance.cdist), next basing on K-nearest neighbors method and Dijkstra algorithm (to determinate the shortest path) I have Computed the full distance matrix over all paths, finally I have did map computations, following by the dimensionality reduction.
BUT, I want to use epsilon instead of K-nearest neighbors like in the following :
Y = isomap (X, epsilon, d)
• X is an n × m matrix which corresponds to n points with m attributes.
• epsilon is an anonymous function of the distance matrix used to find the parameters of neighborhood. (The neighborhood graph must be formed by eliminating the edges whose width is greater to epsilon of the complete distance graph).
• d is a parameter which signifies the output dimension.
• Y is an n × d matrix, which signifies the embedding resulting from isomap.
THANKS in advance
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cdist
def distance_Matrix(X):
return cdist(X,X,'euclidean')
def Dijkstra(h):
q = h.copy()
for i in range(ndata):
for j in range(ndata):
k = np.argmin(q[i,:])
while not(np.isinf(q[i,k])):
q[i,k] = np.inf
for l in neighbours[k,:]:
possible = h[i,l] + h[l,k]
if possible < h[i,k]:
h[i,k] = possible
k = np.argmin(q[i,:])
return h
def MDS(D,newdim=2):
n = D.shape[0]
# Torgerson formula
I = np.eye(n)
J = np.ones(D.shape)
J = I-(1/n)*J
B = (-1/2)*np.dot(np.dot(J,D),np.dot(D,J)) # B = -(1/2).JD²J
#
eigenval, eigenvec = np.linalg.eig(B)
indices = np.argsort(eigenval)[::-1]
eigenval = eigenval[indices]
eigenvec = eigenvec[:, indices]
# dimension reduction
K = eigenvec[:, :newdim]
L = np.diag(eigenval[:newdim])
# result
Y = K # L **(1/2)
return np.real(Y)
def isomap(data,newdim=2,K=12):
ndata = np.shape(data)[0]
ndim = np.shape(data)[1]
d = distance_Matrix(X)
# replace begin
# K-nearest neighbours
indices = d.argsort()
#notneighbours = indices[:,K+1:]
neighbours = indices[:,:K+1]
# replace end
h = np.ones((ndata,ndata),dtype=float)*np.inf
for i in range(ndata):
h[i,neighbours[i,:]] = d[i,neighbours[i,:]]
h = Dijkstra(h)
return MDS(h,newdim)
Try sklearn.neighbors.radius_neighbors_graph for your distance matrix
BACKGROUND: I am trying to build a real-time drum simulation model, for which I need really fast matrix-vector products. My matrices are of the size ~5000-10000 rows/cols, out of which only 6 entries per row are non-zero, hence I am inclined to use sparse matrices. I am using scipy.sparse module. The iterations are as below.
Vjk_plus_sparse = Vjk_minus_sparse.transpose()
Vj = Vjk_plus_sparse.dot(constant)
np.put(Vj, Nr, 0.0)
Uj[t] = Uj[t-1] + np.transpose(Vj)/fs
Vj_mat = adj_mat_sparse.multiply(Vj)
Vjk_minus_sparse = Vj_mat-Vjk_plus_sparse.multiply(end_gain)
Here, Vjk_plus_sparse, Vjk_minus_sparse and Vj_mat are sparse CSR matrices, Vj is a numpy array, and Uj is a numpy matrix where each row represents Uj(t). end_gain is an array which is a static numpy array for dampening of vibrations.
THE ISSUE: A single iteration takes about 3 ms for size = 4250. With the most significant
steps being the last 2 lines. They together take about 2.5 ms. I would ideally need it to run in 0.1 ms, which would be more than a 10x speedup. This is the maximum extent of vectorization possible for the problem, and I cannot parallelize as I am marching in time, at least physically it won't be accurate.
ATTEMPTS: I tried fiddling with the sparse data structures, and found best performance with all of them being CSR (Compressed Sparse Row), with the values as quoted above. I also tried to replace the multiply() method with a matrix multiplication, by repeating Vj, but that worsened the time, as the resultant operation would be a sparse*dense operation.
How can I speed this up within python itself? I am open to trying c++ as well, though migrating now would be a major pain. Also, since scipy is essentially based in c, would it even give that much of a speedup?
Added a complete runnable example
import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.patches
import math
from mpl_toolkits import mplot3d
import numpy as np
import scipy.sparse as sp
import scipy.fftpack as spf
import matplotlib.animation as animation
import time
sqrt_3 = 1.73205080757
class Pt:
def __init__(self,x_0,y_0):
self.x_0 = x_0
self.y_0 = y_0
self.id = -1
self.neighbours = []
self.distance = (x_0**2 + y_0**2)**0.5
class Circle:
def __init__(self,radius,center):
self.radius = radius
self.center = center
self.nodes = []
def construct_mesh(self, unit):
queue = [self.center]
self.center.distance = 0
curr_id = 0
delta = [(1.,0.), (1./2, (3**0.5)/2),(-1./2, (3**0.5)/2),(-1.,0.), (-1./2,-(3**0.5)/2), (1./2,- (3**0.5)/2)]
node_dict = {}
node_dict[(self.center.x_0,self.center.y_0)] = curr_id
self.nodes.append(self.center)
curr_id+=1
while len(queue)!=0:
curr_pt = queue[0]
queue.pop(0)
# self.nodes.append(curr_pt)
# curr_id+=1
for i in delta:
temp_pt = Pt(curr_pt.x_0 + 2*unit*i[0], curr_pt.y_0 + 2*unit*i[1])
temp_pt.id = curr_id
temp_pt.distance = (temp_pt.x_0 ** 2 + temp_pt.y_0 ** 2)**0.5
# curr_id+=1
if (round(temp_pt.x_0,5), round(temp_pt.y_0,5)) not in node_dict and temp_pt.distance <= self.radius:
# print(temp_pt.x_0, temp_pt.y_0)
self.nodes.append(temp_pt)
node_dict[(round(temp_pt.x_0,5), round(temp_pt.y_0,5))] = curr_id
curr_id+=1
queue.append(temp_pt)
curr_pt.neighbours.append(temp_pt.id)
elif temp_pt.distance <= self.radius:
curr_pt.neighbours.append(node_dict[round(temp_pt.x_0,5), round(temp_pt.y_0,5)])
# print(node_dict)
def plot_neighbours(self, pt):
x = []
y = []
x.append(pt.x_0)
y.append(pt.y_0)
for i in (pt.neighbours):
x.append(self.nodes[i].x_0)
y.append(self.nodes[i].y_0)
plt.scatter(x,y)
plt.axis('scaled')
def boundary_node_ids(self):
boundary_nodes = []
for j in range(len(self.nodes)):
if(len(self.nodes[j].neighbours) < 6):
boundary_nodes.append(j)
return boundary_nodes
def add_rim(self, boundary_node_ids, unit):
c = self.center
rim_ids = []
N = len(self.nodes)
for i in range(len(boundary_node_ids)):
d = self.nodes[boundary_node_ids[i]].distance
xp = self.nodes[boundary_node_ids[i]].x_0
yp = self.nodes[boundary_node_ids[i]].y_0
xnew = xp + xp*unit/d
ynew = yp + yp*unit/d
new_point = Pt(xnew, ynew)
new_point.id = N + i
rim_ids.append(N+i)
self.nodes.append(new_point)
self.nodes[boundary_node_ids[i]].neighbours.append(new_point.id)
self.nodes[N+i].neighbours.append(boundary_node_ids[i])
return rim_ids
def find_nearest_point(mesh, pt):
distances_from_center = np.zeros(len(mesh.nodes))
for i in xrange(len(mesh.nodes)):
distances_from_center[i] = mesh.nodes[i].distance
target_distance = pt.distance
closest_point_id = np.argmin(np.abs(distances_from_center-target_distance))
return closest_point_id
def init_impulse(mesh, impulse, Vj, poi, roi):
data = []
for i in range(len(Vj)):
r = ((mesh.nodes[i].x_0 - mesh.nodes[poi].x_0)**2 + (mesh.nodes[i].y_0 - mesh.nodes[poi].y_0)**2)**0.5
Vj[i] = max(0, impulse*(1. - (r/roi)))
if i in Nr:
Vj[i] = 0.
for k in mesh.nodes[i].neighbours:
data.append(np.asscalar(Vj[i])/2.)
return Vj, data
r = 0.1016 #Radius of drum head
# rho = 2500 #Density of drum head
thickness = 0.001 #Thickness of membrane
# tension = 1500 #Tension in membrane in N
param = 0.9
c = (param/thickness)**(0.5) #Speed of wave in string
duration = 0.25
fs = 4000
delta = c/fs
center = Pt(0,0)
point_of_impact = Pt(r/2., 0)
center.id = 0
mesh = Circle(r,center)
mesh.construct_mesh(delta)
N = len(mesh.nodes)
Nb = []
for j in range(N):
if len(mesh.nodes[j].neighbours) < 6:
Nb.append(j)
Nr = mesh.add_rim(Nb, delta)
N = len(mesh.nodes)
print(N)
row_ind = []
col_ind = []
for j in range(N):
for k in mesh.nodes[j].neighbours:
row_ind.append(j)
col_ind.append(k)
data = np.ones(len(col_ind))
adj_mat_sparse = sp.csr_matrix((data, (row_ind, col_ind)), shape = (N,N))
Vjk_plus = sp.csr_matrix([N, N])
Vj = np.zeros([N,1])
Uj = np.zeros([int(duration*fs), N])
Vj_mat = sp.csc_matrix([N,N])
closest_point_id = find_nearest_point(mesh, point_of_impact)
Vj, Vjk_data = init_impulse(mesh, -10.0, Vj, closest_point_id, r/10.)
Vjk_minus_sparse = sp.csr_matrix((Vjk_data, (row_ind, col_ind)), shape = (N,N))
constant = (1./3)*np.ones([N,1])
Vjk_plus = Vjk_minus_sparse.transpose()
np.put(Vj, Nr, 0.0)
Uj[1] = Uj[0] + np.transpose(Vj)/fs
Vj_mat = adj_mat_sparse.multiply(Vj)
Vjk_minus_sparse = Vj_mat - Vjk_plus
end_gain = np.ones([N,1])
end_gain[Nr] = 1.0
for t in range(2,int(duration*fs)):
Vjk_plus = Vjk_minus_sparse.transpose()
Vj = Vjk_plus.dot(constant)
np.put(Vj, Nr, 0.0)
Uj[t] = Uj[t-1] + np.transpose(Vj)/fs
Vj_mat = adj_mat_sparse.multiply(Vj)
Vjk_minus_sparse = Vj_mat-Vjk_plus.multiply(end_gain)
I am trying to implement a finite difference approximation to solve the Heat Equation, u_t = k * u_{xx}, in Python using NumPy.
Here is a copy of the code I am running:
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt
# parameters
L = 1 # legnth of the rod
T = 10 # terminal time
N = 10
M = 100
s = 0.25
# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N
x = np.arange(x_init, x_end, dx)
x[0] = x_init
# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M
t = np.arange(t_init, t_end, dt)
t[0] = t_init
# Boundary Conditions
for m in xrange(0, M):
t[m] = m * dt
# Initial Conditions
for j in xrange(0, N):
x[j] = j * dx
# definition of solution u(x,t) to u_t = k * u_xx
u = np.zeros((N, M+1)) # array to store values of the solution
# Finite Difference Scheme:
u[:,0] = x**2 #initial condition
for m in xrange(0, M):
for j in xrange(1, N-1):
if j == 1:
u[j-1,m] = 0 # Boundary condition
elif j == N-1:
u[j+1,m] = 0
else:
u[j,m+1] = u[j,m] + s * ( u[j+1,m] -
2 * u[j,m] + u[j-1,m] )
print u, #t, x
plt.plot(u, t)
#plt.show()
I think my code is working properly and it is producing an output. I want to plot the output of the solution u versus t (my time vector). If I can plot the graph then I am able to check if my numerical approximation agrees with the expected phenomena for the Heat Equation. However, I am getting the error that "x and y must have same first dimension". How can I correct this issue?
An additional question: Am I better off attempting to make an animation with matplotlib.animation instead of using matplotlib.plyplot ???
Thanks so much for any and all help! It is very greatly appreciated!
Okay so I had a "brain dump" and tried plotting u vs. t sort of forgetting that u, being the solution to the Heat Equation (u_t = k * u_{xx}), is defined as u(x,t) so it has values for time. I made the following correction to my code:
print u #t, x
plt.plot(u)
plt.show()
And now my programming is finally displaying an image. And here it is:
It is absolutely beautiful, isn't it?
i'm currently incredibly stuck on what isn't working in my code and have been staring at it for hours. I have created some functions to approximate the solution to the laplace equation adaptively using the finite element method then estimate it's error using the dual weighted residual. The error function should give a vector of errors (one error for each element), i then choose the biggest errors, add more elements around them, solve again and then recheck the error; however i have no idea why my error estimate isn't changing!
My first 4 functions are correct but i will include them incase someone wants to try the code:
def Poisson_Stiffness(x0):
"""Finds the Poisson equation stiffness matrix with any non uniform mesh x0"""
x0 = np.array(x0)
N = len(x0) - 1 # The amount of elements; x0, x1, ..., xN
h = x0[1:] - x0[:-1]
a = np.zeros(N+1)
a[0] = 1 #BOUNDARY CONDITIONS
a[1:-1] = 1/h[1:] + 1/h[:-1]
a[-1] = 1/h[-1]
a[N] = 1 #BOUNDARY CONDITIONS
b = -1/h
b[0] = 0 #BOUNDARY CONDITIONS
c = -1/h
c[N-1] = 0 #BOUNDARY CONDITIONS: DIRICHLET
data = [a.tolist(), b.tolist(), c.tolist()]
Positions = [0, 1, -1]
Stiffness_Matrix = diags(data, Positions, (N+1,N+1))
return Stiffness_Matrix
def NodalQuadrature(x0):
"""Finds the Nodal Quadrature Approximation of sin(pi x)"""
x0 = np.array(x0)
h = x0[1:] - x0[:-1]
N = len(x0) - 1
approx = np.zeros(len(x0))
approx[0] = 0 #BOUNDARY CONDITIONS
for i in range(1,N):
approx[i] = math.sin(math.pi*x0[i])
approx[i] = (approx[i]*h[i-1] + approx[i]*h[i])/2
approx[N] = 0 #BOUNDARY CONDITIONS
return approx
def Solver(x0):
Stiff_Matrix = Poisson_Stiffness(x0)
NodalApproximation = NodalQuadrature(x0)
NodalApproximation[0] = 0
U = scipy.sparse.linalg.spsolve(Stiff_Matrix, NodalApproximation)
return U
def Dualsolution(rich_mesh,qoi_rich_node): #BOUNDARY CONDITIONS?
"""Find Z from stiffness matrix Z = K^-1 Q over richer mesh"""
K = Poisson_Stiffness(rich_mesh)
Q = np.zeros(len(rich_mesh))
Q[qoi_rich_node] = 1.0
Z = scipy.sparse.linalg.spsolve(K,Q)
return Z
My error indicator function takes in an approximation Uh, with the mesh it is solved over, and finds eta = (f - Bu)z.
def Error_Indicators(Uh,U_mesh,Z,Z_mesh,f):
"""Take in U, Interpolate to same mesh as Z then solve for eta vector"""
u_inter = interp1d(U_mesh,Uh) #Interpolation of old mesh
U2 = u_inter(Z_mesh) #New function u for the new mesh to use in
Bz = Poisson_Stiffness(Z_mesh)
Bz = Bz.tocsr()
eta = np.empty(len(Z_mesh))
for i in range(len(Z_mesh)):
for j in range(len(Z_mesh)):
eta[i] += (f[i] - Bz[i,j]*U2[j])
for i in range(len(Z)):
eta[i] = eta[i]*Z[i]
return eta
My next function seems to adapt the mesh very well to the given error indicator! Just no idea why the indicator seems to stay the same regardless?
def Mesh_Refinement(base_mesh,tolerance,refinement,z_mesh,QOI_z_mesh):
"""Solve for U on a normal mesh, Take in Z, Find error indicators, adapt. OUTPUT NEW MESH"""
New_mesh = base_mesh
Z = Dualsolution(z_mesh,QOI_z_mesh) #Solve dual solution only once
f = np.empty(len(z_mesh))
for i in range(len(z_mesh)):
f[i] = math.sin(math.pi*z_mesh[i])
U = Solver(New_mesh)
eta = Error_Indicators(U,base_mesh,Z,z_mesh,f)
while max(abs(k) for k in eta) > tolerance:
orderedeta = np.sort(eta) #Sort error indicators LENGTH 40
biggest = np.flipud(orderedeta[int((1-refinement)*len(eta)):len(eta)])
position = np.empty(len(biggest))
ratio = float(len(New_mesh))/float(len(z_mesh))
for i in range(len(biggest)):
position[i] = eta.tolist().index(biggest[i])*ratio #GIVES WHAT NUMBER NODE TO REFINE
refine = np.zeros(len(position))
for i in range(len(position)):
refine[i] = math.floor(position[i])+0.5 #AT WHAT NODE TO PUT NEW ELEMENT 5.5 ETC
refine = np.flipud(sorted(set(refine)))
for i in range(len(refine)):
New_mesh = np.insert(New_mesh,refine[i]+0.5,(New_mesh[refine[i]+0.5]+New_mesh[refine[i]-0.5])/2)
U = Solver(New_mesh)
eta = Error_Indicators(U,New_mesh,Z,z_mesh,f)
print eta
An example input for this would be:
Mesh_Refinement(np.linspace(0,1,3),0.1,0.2,np.linspace(0,1,60),20)
I understand there is alot of code here but i am at a loss, i have no idea where to turn!
Please consider this piece of code from def Error_Indicators:
eta = np.empty(len(Z_mesh))
for i in range(len(Z_mesh)):
for j in range(len(Z_mesh)):
eta[i] = (f[i] - Bz[i,j]*U2[j])
Here you override eta[i] each j iteration, so the inner cycle proves useless and you can go directly to the last possible j. Did you mean to find a sum of the (f[i] - Bz[i,j]*U2[j]) series?
eta = np.empty(len(Z_mesh))
for i in range(len(Z_mesh)):
for j in range(len(Z_mesh)):
eta[i] += (f[i] - Bz[i,j]*U2[j])
This code is taking more than half an hour for a data set of 200000 floats.
import numpy as np
try:
import progressbar
pbar = progressbar.ProgressBar(widgets=[progressbar.Percentage(),
progressbar.Counter('%5d'), progressbar.Bar(), progressbar.ETA()])
except:
pbar = list
block_length = np.loadtxt('bb.txt.gz') # get data file from http://filebin.ca/29LbYfKnsKqJ/bb.txt.gz (2MB, 200000 float numbers)
N = len(block_length) - 1
# arrays to store the best configuration
best = np.zeros(N, dtype=float)
last = np.zeros(N, dtype=int)
log = np.log
# Start with first data cell; add one cell at each iteration
for R in pbar(range(N)):
# Compute fit_vec : fitness of putative last block (end at R)
#fit_vec = fitfunc.fitness(
T_k = block_length[:R + 1] - block_length[R + 1]
#N_k = np.cumsum(x[:R + 1][::-1])[::-1]
N_k = np.arange(R + 1, 0, -1)
fit_vec = N_k * (log(N_k) - log(T_k))
prior = 4 - log(73.53 * 0.05 * ((R+1) ** -0.478))
A_R = fit_vec - prior #fitfunc.prior(R + 1, N)
A_R[1:] += best[:R]
i_max = np.argmax(A_R)
last[R] = i_max
best[R] = A_R[i_max]
# Now find changepoints by iteratively peeling off the last block
change_points = np.zeros(N, dtype=int)
i_cp = N
ind = N
while True:
i_cp -= 1
change_points[i_cp] = ind
if ind == 0:
break
ind = last[ind - 1]
change_points = change_points[i_cp:]
print edges[change_points] # show result
The first loop is very slow because the length of arrays is R at every iteration, i.e. increasing, leading to N^2 complexity.
Is there any way to optimize this code further, e.g. through pre-computation? I am also happy with solutions using other programming languages.
I can replicate A_R (up to the fit-prior step) as a upper triangular NxN matrix with:
def trilog(n):
nn = n[:-1,None]-n[None,1:]
nn[np.tril_indices_from(nn,-1)]=1
return nn
T_k = trilog(block_length)
N_k = trilog(-np.arange(N+1))
fit_vec = N_k * (np.log(N_k) - np.log(T_k))
R = np.arange(N)+1
prior = 4 - log(73.53 * 0.05 * (R ** -0.478))
A_R = fit_vec - prior
A_R = np.triu(A_R,0)
print(A_R)
I haven't worked through the logic of calculation and applying best.
I've only done this with small arrays. For your full problem, the corresponding matrix is too large for my memory.
B=np.ones((200000,200000),float)
So just from memory considerations you might be stuck with the for R in range(N) iteration.