Is it possible to optimize this dynamic programming code?

Is it possible to optimize this dynamic programming code? - python

This code is taking more than half an hour for a data set of 200000 floats.
import numpy as np
try:
import progressbar
pbar = progressbar.ProgressBar(widgets=[progressbar.Percentage(),
progressbar.Counter('%5d'), progressbar.Bar(), progressbar.ETA()])
except:
pbar = list
block_length = np.loadtxt('bb.txt.gz') # get data file from http://filebin.ca/29LbYfKnsKqJ/bb.txt.gz (2MB, 200000 float numbers)
N = len(block_length) - 1
# arrays to store the best configuration
best = np.zeros(N, dtype=float)
last = np.zeros(N, dtype=int)
log = np.log
# Start with first data cell; add one cell at each iteration
for R in pbar(range(N)):
# Compute fit_vec : fitness of putative last block (end at R)
#fit_vec = fitfunc.fitness(
T_k = block_length[:R + 1] - block_length[R + 1]
#N_k = np.cumsum(x[:R + 1][::-1])[::-1]
N_k = np.arange(R + 1, 0, -1)
fit_vec = N_k * (log(N_k) - log(T_k))
prior = 4 - log(73.53 * 0.05 * ((R+1) ** -0.478))
A_R = fit_vec - prior #fitfunc.prior(R + 1, N)
A_R[1:] += best[:R]
i_max = np.argmax(A_R)
last[R] = i_max
best[R] = A_R[i_max]
# Now find changepoints by iteratively peeling off the last block
change_points = np.zeros(N, dtype=int)
i_cp = N
ind = N
while True:
i_cp -= 1
change_points[i_cp] = ind
if ind == 0:
break
ind = last[ind - 1]
change_points = change_points[i_cp:]
print edges[change_points] # show result
The first loop is very slow because the length of arrays is R at every iteration, i.e. increasing, leading to N^2 complexity.
Is there any way to optimize this code further, e.g. through pre-computation? I am also happy with solutions using other programming languages.

I can replicate A_R (up to the fit-prior step) as a upper triangular NxN matrix with:
def trilog(n):
nn = n[:-1,None]-n[None,1:]
nn[np.tril_indices_from(nn,-1)]=1
return nn
T_k = trilog(block_length)
N_k = trilog(-np.arange(N+1))
fit_vec = N_k * (np.log(N_k) - np.log(T_k))
R = np.arange(N)+1
prior = 4 - log(73.53 * 0.05 * (R ** -0.478))
A_R = fit_vec - prior
A_R = np.triu(A_R,0)
print(A_R)
I haven't worked through the logic of calculation and applying best.
I've only done this with small arrays. For your full problem, the corresponding matrix is too large for my memory.
B=np.ones((200000,200000),float)
So just from memory considerations you might be stuck with the for R in range(N) iteration.

Related

How to put number of a particular element in a particular row and column constraints in a matrix?

I have to basically create a symmetric matrix of NxN. Which has 1s and 0s randomly populated into it. However the only constraint is I need only one '1' in any row and any column.
I wrote a code to generate the matrix but it has more than one '1' in any row or column. I need to follow the constraint mentioned above, how can i modify my code?
import numpy as np
N = int(input("Enter the number of row and col:"))
my_matrix = np.random.randint(2,size=(N,N))
print(my_matrix)

TL;DR
Each result is generated with equal probability and run with O(n) time complexity:
import random
_prob_cache = [1, 1]
def prob(n):
try:
return _prob_cache[n]
except IndexError:
pass
for i in range(len(_prob_cache) - 1, n):
_prob_cache.append(1 / (i * _prob_cache[-1] + 1))
return _prob_cache[-1]
def symmetric_permutation(n):
res = np.zeros((n, n), int)
remain = list(range(n))
while remain:
m = len(remain)
diag_prob = prob(m)
row = remain.pop()
rnd = random.random()
if rnd < diag_prob:
col = row
else:
nondiag_prob = (1 - diag_prob) / (m - 1)
idx = int((rnd - diag_prob) / nondiag_prob)
remain[idx], remain[-1] = remain[-1], remain[idx]
col = remain.pop()
res[row, col] = res[col, row] = 1
return res
Long Answer
Begin with some derivation:
Let f(n) be the number of all setting schemes of n * n matrix. Obviously, we have:
f(1) = 1
Then take a convention:
f(0) = 1
For n > 1, I can extract a position from any row and set it to 1. There are two cases:
If 1 is on the diagonal, we can remove the row and column of this 1 and continue to set on the remaining (n - 1) * (n - 1) matrix, so the number of remaining setting schemes is f(n - 1).
If 1 is not on the diagonal, the symmetrical part also needs to be set to 1. Then we can remove the row and column where the two 1's are located. We need to continue to set the remaining (n - 2) * (n - 2) matrix. Therefore, the number of remaining setting schemes is f(n - 2).
Therefore, we can deduce:
f(n) = f(n - 1) + (n - 1) * f(n - 2)
According to the above strategy, if we want to make every setting scheme appear with equal probability, we should give different weights to diagonal index and other indices when selecting index. The weight of the diagonal index should be:
p(n) = f(n - 1) / f(n)
Therefore:
f(n) = f(n - 1) + (n - 1) * f(n - 2)
f(n) (n - 1) * f(n - 2)
=> -------- = 1 + ------------------
f(n - 1) f(n - 1)
1
=> ---- = 1 + (n - 1) * p(n - 1)
p(n)
1
=> p(n) = ------------------
(n - 1) * p(n - 1)
The probability function code is as follows:
_prob_cache = [1, 1]
def prob(n):
"""
Iterative version to prevent stack overflow caused by recursion.
Old version:
#lru_cache
def prob(n):
if n == 1:
return 1
else:
return 1 / ((n - 1) * prob(n - 1) + 1)
"""
try:
return _prob_cache[n]
except IndexError:
pass
for i in range(len(_cache) - 1, n):
_prob_cache.append(1 / (i * _prob_cache[-1] + 1))
return _prob_cache[-1]
The weight of the non diagonal index is:
f(n - 2) f(n - 2) f(n - 1)
-------- = -------- * -------- = p(n - 1) * p(n)
f(n) f(n - 1) f(n)
or
f(n - 2) 1 - p(n)
-------- = --------
f(n) n - 1
Here I choose to use the latter to call the function less once.
Specific implementation:
We use a list to store the indices that can still be used. In each loop, we take the last element of the list as the row index (unlike previously said to select the first element, which can speed up the removal of elements from the list), calculate the weight of the two cases and obtain the column index randomly, sets the value of the corresponding position, and removes the used index from the list until the list is empty:
import random
import numpy as np
def symmetric_permutation(n):
res = np.zeros((n, n), int)
remain = list(range(n))
while remain:
m = len(remain)
diag_prob = prob(m)
row = remain.pop()
rnd = random.random()
if rnd < diag_prob:
col = row
else:
nondiag_prob = (1 - diag_prob) / (m - 1)
col = remain.pop(int((rnd - diag_prob) / nondiag_prob))
res[row, col] = res[col, row] = 1
return res
Optimize to O(n) time complexity:
If we do not consider the creation of the zero matrix, the time complexity of the above policy is O(n^2), because every time we have a high probability of removing an index from the list.
However, violent removal is unnecessary. We have no requirements on the order of the remaining indices because the selection of row index does not affect the randomness of column index. Therefore, a cheaper solution is to overwrite the selected column index with the last element, and then remove the last element. This makes the O(n) operation of removing intermediate elements become O(1) operation, so the time complexity becomes O(n):
def symmetric_permutation(n):
res = np.zeros((n, n), int)
remain = list(range(n))
while remain:
m = len(remain)
diag_prob = prob(m)
row = remain.pop()
rnd = random.random()
if rnd < diag_prob:
col = row
else:
nondiag_prob = (1 - diag_prob) / (m - 1)
idx = int((rnd - diag_prob) / nondiag_prob)
remain[idx], remain[-1] = remain[-1], remain[idx]
col = remain.pop()
res[row, col] = res[col, row] = 1
return res
Probability test:
Here we prepare another function to calculate f(n) for the following test:
def f(n):
before_prev, prev = 1, 1
for i in range(1, n):
before_prev, prev = prev, prev + before_prev * i
return prev
Next is a probability test to verify whether the results are uniform enough. Here I take n=8 and build matrix 500_000 times, use the column index of 1 in each row as the identification of each result, and draw a line graph and histogram of the number of occurrences of each result:
from collections import Counter
import matplotlib.pyplot as plt
random.seed(0)
n = 8
times = 500_000
n_bin = 30
cntr = Counter()
cntr.update(tuple(symmetric_permutation(n).nonzero()[1]) for _ in range(times))
assert len(cntr) == f(n)
plt.subplot(2, 1, 1).plot(cntr.values())
plt.subplot(2, 1, 2).hist(cntr.values(), n_bin)
plt.show()
It can be seen from sub figure 1 that the number of occurrences of each result is roughly within the range of 650 ± 70, and it can be observed from sub figure 2 that the distribution of the number of occurrences of each result is close to the Gaussian distribution:
For #AndrzejO 's answer, the same code test is used here, and his solution is faster (after optimization, the speed of the two is almost the same now), but the probability of each result does not seem equal (note that various results also appear here):

Create a matrix with zeros. Then, you need to take randomly N row numbers, without repetition, and randomly N column numbers, without repetition. You can use random.sample for this. Then put 1 on the row/column positions.
import numpy as np
from random import sample
N = int(input("Enter the number of row and col:"))
my_matrix = np.zeros((N,N), dtype='int8')
rows = sample(range(N), N)
cols = sample(range(N), N)
points = zip(rows, cols)
for x, y in points:
my_matrix[x, y] = 1
print(my_matrix)
If you want a symmetrical matrix: In case N is even, I would take N random numbers out of N, half of them as x, half as y; and on both positions (x,y) and (y,x) put a 1. If N is uneven, an additional 1 needs to be put on a random position on the diagonal.
import numpy as np
from random import sample, randint
N = int(input("Enter the number of row and col:"))
even = N%2 == 0
my_matrix = np.zeros((N,N), dtype='int8')
N_range = list(range(N))
if not even:
diagonal = randint(0, N-1)
N_range.remove(diagonal)
my_matrix[diagonal, diagonal] = 1
N = N - 1
rowcol = sample(N_range, N)
rows = rowcol[:N//2]
cols = rowcol[N//2:]
for x, y in zip(rows,cols):
my_matrix[x, y] = 1
my_matrix[y, x] = 1
Here is a better version. Take the first free row, get a random free column, put a 1 on (row,col) and (col,row). Remove the used col/row. Repeat until all numbers 0-(N-1) are used.
import numpy as np
import random
N = int(input("Enter the number of row and col:"))
my_matrix=np.zeros((N,N))
not_used_number = list(range(N))
while len(not_used_number) != 0:
current_row = not_used_number[0]
random_col = random.choice(not_used_number)
my_matrix[current_row, random_col] = 1
my_matrix[random_col, current_row] = 1
not_used_number.remove(current_row)
if current_row != random_col:
not_used_number.remove(random_col)

Vectorizing Tensor Products from Python to Matlab

I am in the process of converting some code from Python into Matlab. I have code working that produces the same results, but I am wondering if there may be a way to vectorize some of my for loops in the Matlab code as it take a long time to run. X in an Nxd matrix, diff is an NxNxd tensor, kxy is an NxN matrix, gradK is an NxNx2 tensor, and sumkxy, dxkxy, and obj are all Nxd matrices.
Here is the original Python Code:
diff = x[:, None, :] - x[None, :, :] # D_{ij, s}
kxy = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2)) / np.power(np.pi * 2.0 * h * h, d / 2) # -1 last dimension K_{ij]
gradK = -diff * kxy[:, :, None] / h ** 2 # N * N * 2
sumkxy = np.sum(kxy, axis=1)
dxkxy = np.sum(gradK, axis=1) # N * 2 sum_{i} d_i K_{ij, s}
obj = np.sum(gradK / sumkxy[None, :, None], axis=1) # N * 2
and here is my initial Matlab Code with all the for loops:
diff = zeros([n,n,d]);
for i = 1:n
for j = 1:n
for k = 1:d
diff(i,j,k) = x(i,k) - x(j,k);
end
end
end
kxy = exp(-sum(dif.^2, 3)/(2*h^2))/((2*pi*h^2)^(d/2));
sumkxy = sum(kxy,2);
gradK = zeros([n,n,d]);
for i = 1:n
for j = 1:n
for k = 1:d
gradK(i,j,k) = -diff(i,j,k)*kxy(i, j)/h^2;
end
end
end
dxkxy = squeeze(sum(gradK,2));
a = zeros([n,n,d]);
for i =1:n
for j = 1:n
for k = 1:d
a(i,j,k) = gradK(i,j,k)/sumkxy(i);
end
end
end
obj = squeeze(sum(a, 2));
I know a way a faster way to calculate the kxy term is to use the following code:
XY = x*x';
x2= sum(x.^2, 2);
X2e = repmat(x2, 1, n);
H = (X2e + X2e' - 2*XY); % calculate pairwise distance
Kxy = exp(-H/(2*h^2))/((2*pi*h*h)^(d/2));
But then I struggle on a way to then calculate gradK efficiently without diff. Any help or suggestions would be greatly appreciated!

If your goal is computation of obj you don't need even to compute gradK and a:
sx = sum(x.^2, 2);
H = sx - 2*x*x.' + sx.';
kxy = exp(-H/(2*h^2))/((2*pi*h^2)^(d/2));
kh = kxy / h^2;
sumkxy = sum(kxy, 2);
khs = kh ./ sumkxy;
obj = khs * x - sum(khs, 2) .* x;
gradK and dif can be computed this way:
dif = reshape(x, n, 1, d) - reshape(x, 1, n, d);
gradK = -dif .* (kxy / h^2);.

I like to try approaching problems like these by breaking it down into "subcomponents" with some bogus data that will execute quickly and that you can use to test the code functionality. The first subcomponent you might start with is your first nested loop calculating diff:
n = 100;
d = 50;
x = round(100*rand(n,d));
tic
diff = zeros([n,n,d]);
for i = 1:n
for j = 1:n
for k = 1:d
diff(i,j,k) = x(i,k) - x(j,k);
end
end
end
toc
First, consider the innermost loop on its own:
...
for k = 1:d
diff(i,j,k) = x(i,k) - x(j,k);
end
...
Looking at this loop (at least for me!) simplifies things greatly. To vectorize just this "subcomponent" we could write something like:
diff(i,j,:) = x(i,:) - x(j,:);
Now that the low hanging fruit is out of the way, lets consider the next layer of loop. Does doing the same trick as before work?
diff(i,:,:) = x(i,:) - x; % where x(:,:) can just be written as x.
If you aren't sure, you can check this by running both the nested loop version and the one above with the same (emphasis on same) bogus data and checking if they are equal using isequal(). To cut to the chase, it should come out the same and now your original loop is down to:
tic
diff = zeros([n,n,d]);
for i = 1:n
diff(i,:,:) = x(i,:) - x;
end
toc
For this final bit, you can exploit matlab's matrix/array reshaping/permuting functions. Look up the documentation for reshape() or permute() for more details. In brief, if you reshape or change the order of the dimensions of one copy of x from Nxd to 1xNxd, subtracting x from another, regularly sized matrix will perform the operations elementwise in matlab. So for example:
diff = permute(x,[1,3,2]) - permute(x,[3,1,2]); % this is Nx1xd - 1xNxd
should effectively compute the tensor difference you were looking for in the first loop!
I can expand this answer to show how the other loops might be worked out if you want, but give the other ones a try first with this same logic. Hopefully, you can keep diff and then calculate kxy much faster. Without knowing how big your original matrices are, I can't say how much speedup you should expect though.
Update:
I should add, to ensure that you are doing elementwise multiplication, division and transpose operations make sure to add a '.' before each command. E.g.
gradK(i,j,:) = -diff(i,j,:).*kxy(i, j)/h^2;
For more information, look up elementwise operations in Matlab

Multigrid Poisson Solver

I am trying to make my own CFD solver and one of the most computationally expensive parts is solving for the pressure term. One way to solve Poisson differential equations faster is by using a multigrid method. The basic recursive algorithm for this is:
function phi = V_Cycle(phi,f,h)
% Recursive V-Cycle Multigrid for solving the Poisson equation (\nabla^2 phi = f) on a uniform grid of spacing h
% Pre-Smoothing
phi = smoothing(phi,f,h);
% Compute Residual Errors
r = residual(phi,f,h);
% Restriction
rhs = restriction(r);
eps = zeros(size(rhs));
% stop recursion at smallest grid size, otherwise continue recursion
if smallest_grid_size_is_achieved
eps = smoothing(eps,rhs,2*h);
else
eps = V_Cycle(eps,rhs,2*h);
end
% Prolongation and Correction
phi = phi + prolongation(eps);
% Post-Smoothing
phi = smoothing(phi,f,h);
end
I've attempted to implement this algorithm myself (also at the end of this question) however it is very slow and doesn't give good results so evidently it is doing something wrong. I've been trying to find why for too long and I think it's just worthwhile seeing if anyone can help me.
If I use a grid size of 2^5 by 2^5 points, then it can solve it and give reasonable results. However, as soon as I go above this it takes exponentially longer to solve and basically get stuck at some level of inaccuracy, no matter how many V-Loops are performed. at 2^7 by 2^7 points, the code takes way too long to be useful.
I think my main issue is that my implementation of a jacobian iteration is using linear algebra to calculate the update at each step. This should, in general, be fast however, the update matrix A is an n*m sized matrix, and calculating the dot product of a 2^7 * 2^7 sized matrix is expensive. As most of the cells are just zeros, should I calculate the result using a different method?
if anyone has any experience in multigrid methods, I would appreciate any advice!
Thanks
my code:
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 29 16:24:16 2020
#author: mclea
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import convolve2d
from mpl_toolkits.mplot3d import Axes3D
from scipy.interpolate import griddata
from matplotlib import cm
def restrict(A):
"""
Creates a new grid of points which is half the size of the original
grid in each dimension.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)/2+2)
new_m = int((m-2)/2+2)
new_array = np.zeros((new_n, new_m))
for i in range(1, new_n-1):
for j in range(1, new_m-1):
ii = int((i-1)*2)+1
jj = int((j-1)*2)+1
# print(i, j, ii, jj)
new_array[i,j] = np.average(A[ii:ii+2, jj:jj+2])
new_array = set_BC(new_array)
return new_array
def interpolate_array(A):
"""
Creates a grid of points which is double the size of the original
grid in each dimension. Uses linear interpolation between grid points.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)*2 + 2)
new_m = int((m-2)*2 + 2)
new_array = np.zeros((new_n, new_m))
i = (np.indices(A.shape)[0]/(A.shape[0]-1)).flatten()
j = (np.indices(A.shape)[1]/(A.shape[1]-1)).flatten()
A = A.flatten()
new_i = np.linspace(0, 1, new_n)
new_j = np.linspace(0, 1, new_m)
new_ii, new_jj = np.meshgrid(new_i, new_j)
new_array = griddata((i, j), A, (new_jj, new_ii), method="linear")
return new_array
def adjacency_matrix(rows, cols):
"""
Creates the adjacency matrix for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 1
return M
def create_differences_matrix(rows, cols):
"""
Creates the central differences matrix A for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = -1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = -1
np.fill_diagonal(M, 4)
return M
def set_BC(A):
"""
Sets the boundary conditions of the field
"""
A[:, 0] = A[:, 1]
A[:, -1] = A[:, -2]
A[0, :] = A[1, :]
A[-1, :] = A[-2, :]
return A
def create_A(n,m):
"""
Creates all the components required for the jacobian update function
for an n by m shaped grid
"""
LaddU = adjacency_matrix(n,m)
A = create_differences_matrix(n,m)
invD = np.zeros((n*m, n*m))
np.fill_diagonal(invD, 1/4)
return A, LaddU, invD
def calc_RJ(rows, cols):
"""
Calculates the jacobian update matrix Rj for an n by m shaped grid
"""
n = int(rows*cols)
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 0.25
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 0.25
return M
def jacobi_update(v, f, nsteps=1, max_err=1e-3):
"""
Uses a jacobian update matrix to solve nabla(v) = f
"""
f_inner = f[1:-1, 1:-1].flatten()
n = v.shape[0]
m = v.shape[1]
A, LaddU, invD = create_A(n-2, m-2)
Rj = calc_RJ(n-2,m-2)
update=True
step = 0
while update:
v_old = v.copy()
step += 1
vt = v_old[1:-1, 1:-1].flatten()
vt = np.dot(Rj, vt) + np.dot(invD, f_inner)
v[1:-1, 1:-1] = vt.reshape((n-2),(m-2))
err = v - v_old
if step == nsteps or np.abs(err).max()<max_err:
update=False
return v, (step, np.abs(err).max())
def MGV(f, v):
"""
Solves for nabla(v) = f using a multigrid method
"""
# global A, r
n = v.shape[0]
m = v.shape[1]
# If on the smallest grid size, compute the exact solution
if n <= 6 or m <=6:
v, info = jacobi_update(v, f, nsteps=1000)
return v
else:
# smoothing
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-1)
A = create_A(n, m)[0]
# calculate residual
r = np.dot(A, v.flatten()) - f.flatten()
r = r.reshape(n,m)
# downsample resitdual error
r = restrict(r)
zero_array = np.zeros(r.shape)
# interploate the correction computed on a corser grid
d = interpolate_array(MGV(r, zero_array))
# Add prolongated corser grid solution onto the finer grid
v = v - d
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-6)
return v
sigma = 0
# Setting up the grid
k = 6
n = 2**k+2
m = 2**(k)+2
hx = 1/n
hy = 1/m
L = 1
H = 1
x = np.linspace(0, L, n)
y = np.linspace(0, H, m)
XX, YY = np.meshgrid(x, y)
# Setting up the initial conditions
f = np.ones((n,m))
v = np.zeros((n,m))
# How many V cyles to perform
err = 1
n_cycles = 10
loop = True
cycle = 0
# Perform V cycles until converged or reached the maximum
# number of cycles
while loop:
cycle += 1
v_new = MGV(f, v)
if np.abs(v - v_new).max() < err:
loop = False
if cycle == n_cycles:
loop = False
v = v_new
print("Number of cycles " + str(cycle))
plt.contourf(v)

I realize that I'm not answering your question directly, but I do note that you have quite a few loops that will contribute some overhead cost. When optimizing code, I have found the following thread useful - particularly the line profiler thread. This way you can focus in on "high time cost" lines and then start to ask more specific questions regarding opportunities to optimize.
How do I get time of a Python program's execution?

How to create multiple files using a nested loop

I hope everybody is well, safe, and healthy during this time.
I'm currently working on a python assignment. Using the following code, I need to run through each value of beta, and for each value of beta, I need to run through each value of reduction)factor to run through the following steps.
Then for each iteration of reductiom_factor for each value of beta I need to save the data in a file with the titles in listofsolutions. Two things I'm not sure how to do: is how to export data with a file with a name in list of solutions (which are in order for each value of beta and reduction_factor), and how to use np.savez to save the data to the files I have just created.
This is my amended code
b = [1/4, 0]
beta = np.asarray(b)
gamma = 0.5
listofsolutions = ['Q2_AA_0.1','Q2_AA_0.9','Q2_AA_0.99', 'Q2_AA_1', 'Q2_AA_1.1', 'Q2_AA_2','Q2_CD_0.1','Q2_CD_0.9','Q2_CD_0.99', 'Q2_CD_1', 'Q2_CD_1.1', 'Q2_CD_2']
consistent = True # use a consistent mass matrix
for bb in itertools.zip_longest(beta):
c = np.sqrt(E / rho_tilde) # wave speed
T = 0.016 # total time
# compute the critical time-step
# note: uncondionally stable AA scheme will return 1.0
delta_t_crit = fe.get_delta_t_crit(le = le, gamma = gamma, beta = bb, consistent = consistent, c = c)
# actual times-step used is a factor of the critical time-step
reduction_factor = [0.1, 0.9, 0.99, 1, 1.1, 2]
for rf in reduction_factor:
delta_t = rf * delta_t_crit
# selected output data is stored to a file with the name given below
# use this to save the results from the different runs
# change the name to match the data you want to store
for i in b and r in reduction_factor:
outfile[i] = listofsolutions[i]
n_t_steps = int(np.ceil(T / delta_t)); # number of time step
# initialise the time domain, K and M
t = np.linspace(0, T, n_t_steps)
K = np.zeros((n_dof, n_dof))
M = np.zeros((n_dof, n_dof))
# assemble K and M
for ee in range(n_el):
dof_index = fe.get_dof_index(ee)
M[np.ix_(dof_index, dof_index)] += fe.get_Me(le = le, Ae = Ae, rho_tilde_e = rho_tilde, consistent = consistent)
# damping matrix
C = np.zeros((n_dof, n_dof))
# assemble the system matrix A
A_matrix = M + (gamma * delta_t) * C + (beta * delta_t**2)*K
# define the free dofs
free_dof = np.arange(1,n_dof)
# initial conditions
d = np.zeros((n_dof, 1))
v = np.zeros((n_dof, 1))
F = np.zeros((n_dof, 1))
# compute the initial acceleration
a = np.linalg.solve(M, F - C.dot(v) - K.dot(d))
# store the history data
# rows -> each node
# columns -> each time step including initial at 0
d_his = np.zeros((n_dof, n_t_steps))
v_his = np.zeros((n_dof, n_t_steps))
a_his = np.zeros((n_dof, n_t_steps))
d_his[:,0] = d[:,0]
v_his[:,0] = v[:,0]
a_his[:,0] = a[:,0]
# loop over the time domain and solve the problem at each step
for n in range(1,n_t_steps):
# data at beginning of the time-step n
a_n = a
v_n = v
d_n = d
# applied loading
t_current = n * delta_t # current time
if t_current<0.001:
F[-1] = A_bar * Ae * np.sin(1000 * t_current * np.pi)
else:
F[-1]=0.
# define predictors
d_tilde = d_n + delta_t*v_n + ((delta_t**2)/2.) * (1 - 2*beta) * a_n
v_tilde = v_n + (1 - gamma) * delta_t * a_n
# assemble the right-hand side from the known data
R = F - C.dot(v_tilde) - K.dot(d_tilde)
# impose essential boundary condition and solve A a = RHS
A_free = A_matrix[np.ix_(free_dof, free_dof)]
R_free = R[np.ix_(free_dof)]
# solve for the accelerations at the free nodes
a_free = np.linalg.solve(A_free, R_free)
a = np.zeros((n_dof, 1))
a[1:] = a_free
# update displacement and vecloity predictors using the acceleration
d = d_tilde + (beta * delta_t**2) * a
v = v_tilde + (gamma * delta_t) * a
# store solutions
d_his[:,n] = d[:,0]
v_his[:,n] = v[:,0]
a_his[:,n] = a[:,0]
# post-processing
mid_node = int(np.ceil(n_dof / 2)) # mid node
# compute the stress in each element
# assuming constant E
stress = (E / le) * np.diff(d_his, axis=0)
# here we save the stress data for the middle element
np.savez(outfile, t, stress[mid_node,:]
I'm not sure how to specify to the program to save the result for each value of reduction_factor within gamma. In addition, for the last line of the code, I'm not sure how to save each iteration to the list of file names I have created.
I tried to do this using the statement"
for i in b` and r in reduction_factor:
outfile[i] = listofsolutions[i]"`
but I don't think this makes sense.
I am a total newbie at python so I am not familiar how to save files within nested loops.. I apologize if any of my questions are rudimentary.

for i in b and r in reduction_factor:
outfile[i] = listofsolutions[i]
It is not correct. Possible solution:
for i in b: # variable 'i' will take every value of list b
for r in reduction_factor: # 'r' will iterate through reduction_factor
outfile[i] = listofsolutions[i] # outfile must be declared until it
Still, there is no logic. In that way, you can create a dictionary.
If you really want to create the file - read about "with open(filename, 'w'):" construction and nested loops or list comprehension.

What is the efficient way to perform 2D matrix filtering in python?

I am trying to implement a matrix filtering in Python, and so far the implementation appears to be very slow and inefficient. I wonder if there is an efficient way of performing such filtering.
Provided a large matrix A and a filtering matrix M, the function should return a "remixed" matrix R which is obtained by multiplying each element (i,j) of A by M, then the result is superposed/inserted into R at position (i,j). Please find below the code that is expected to do this.
The example below takes about 68 sec (!) at my computer, which seems very inefficient.
I would be very grateful if you could recommend the way to speed-up this function. Many thanks in advance!
import numpy as np
import time
nx = ny = 1500
n_mix = 50
# matrix to be filtered
A = np.random.random_sample( (nx, ny) )
# filter to be applied to each point:
M = np.random.random_sample( (2*n_mix+1, 2*n_mix+1) )
# the result is stored in "remix":
remix = np.zeros_like(A)
start = time.time()
for i in range(n_mix, nx-n_mix):
for j in range(n_mix, ny-n_mix):
remix[i - n_mix:i + n_mix + 1, j - n_mix:j + n_mix + 1 ] += M * A[i,j]
print remix
duration = time.time() - start
print(round(duration))
UPDATE
In fact the ndimage package in scipy has the general convolution function that does the job. I post below the 3 variants of doing the filtering, with respected times. The fastest is with ndimage.convolution (24 seconds vs. 56 and 68 by other methods). However, it still seems rather slow...
import numpy as np
from scipy import ndimage
import time
import sys
def remix_function(A, M):
n = (np.shape(M)[0]-1)/2
R = np.zeros_like(A)
for k in range(-n, n+1):
for l in range(-n, n+1):
# Ak = np.roll(A, -k, axis = 0)
# Akl = np.roll(Ak, -l, axis = 1)
R += np.roll(A, (-k,-l), axis = (0,1) ) * M[n-k, n-l]
return R
if __name__ == '__main__':
np.set_printoptions(precision=2)
nx = ny = 1500
n_mix = 50
nb = 2*n_mix+1
# matrix to be filtered
A = np.random.random_sample( (nx, ny) )
# filter to be applied to each point:
M = np.random.random_sample( (nb, nb) )
# the result is stored in "remix":
remix1 = np.zeros_like(A)
remix2 = np.zeros_like(A)
remix3 = np.zeros_like(A)
#------------------------------------------------------------------------------
# var 1
#------------------------------------------------------------------------------
start = time.time()
remix1 = remix_function(A, M)
duration = time.time() - start
print('time for var1 =', round(duration))
#------------------------------------------------------------------------------
# var 2
#------------------------------------------------------------------------------
start = time.time()
for i in range(n_mix, nx-n_mix):
for j in range(n_mix, ny-n_mix):
remix2[i - n_mix:i + n_mix + 1, j - n_mix:j + n_mix + 1 ] += M * A[i,j]
duration = time.time() - start
print('time for var2 =', round(duration))
#------------------------------------------------------------------------------
# var 3
#------------------------------------------------------------------------------
start = time.time()
remix3 = ndimage.convolve(A, M)
duration = time.time() - start
print('time for var3 (convolution) =', round(duration))

I can't comment on posts yet, but your double for loop is the problem. Have you tried defining a function and then using np.vectorize?

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Is it possible to optimize this dynamic programming code? - python

Related

How to put number of a particular element in a particular row and column constraints in a matrix?

Vectorizing Tensor Products from Python to Matlab

Multigrid Poisson Solver

How to create multiple files using a nested loop

What is the efficient way to perform 2D matrix filtering in python?

Categories

Resources