How to make my python integration faster? - python

Hi, I want to integrate a function from 0 to several different upper limits (around 1000 of them). I have written a piece of code to do this using a for loop and appending each value to an empty array. However, I realise I could make the code faster by doing smaller integrals and then adding the previous integral result to the one just calculated. So I would be doing the same number of integrals, but each over a smaller interval, then adding the previous result to get the integral from 0 to that upper limit. Here's my code at the moment:
import numpy as np  # importing all relevant modules and functions
from scipy.integrate import quad
import pylab as plt
import datetime
t0 = datetime.datetime.now()  # initial time
num = np.linspace(0, 10, num=1000)  # setting up array of values for t
Lv = np.array([])  # empty array that values for L(t) are appended to
def L(t):  # defining function for L
    return np.cos(2*np.pi*t)
for g in num:  # for loop doing the integral of L for each upper limit t
    Lval, x = quad(L, 0, g)  # quad takes the function, the lower limit and the upper limit
    Lv = np.append(Lv, [Lval])  # appending the value of the integral for this t
What changes do I need to make to do the optimisation technique I've suggested?

Basically, we need to keep track of the previous values of Lval and g. 0 is a good initial value for both, since we want to start by adding 0 to the first integral, and 0 is the start of the interval. You can replace your for loop with this:
last, lastG = 0, 0
for g in num:
    Lval, x = quad(L, lastG, g)
    last, lastG = last + Lval, g
    Lv = np.append(Lv, [last])
In my testing, this was noticeably faster.
As @askewchan points out in the comments, this is even faster, because appending to a list avoids copying the whole array on every np.append call:
Lv = []
last, lastG = 0, 0
for g in num:
    Lval, x = quad(L, lastG, g)
    last, lastG = last + Lval, g
    Lv.append(last)
Lv = np.array(Lv)

Using this function:
scipy.integrate.cumtrapz
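I was able to reduce the time to below the resolution of the timer (the measured time is effectively zero).
The function does what you are asking for in a highly efficient manner, with the caveat that it applies the trapezoidal rule to the sampled values rather than adaptive quadrature as quad does. In newer SciPy releases the same function is available as cumulative_trapezoid. See docs for more info: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.integrate.cumtrapz.html
The following code reproduces your version first and then mine: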
# Module Declarations
import numpy as np
from scipy.integrate import quad
from scipy.integrate import cumtrapz
import time
# Initialise Time Array
num=np.linspace(0,10,num=1000)
# Your Method
t0 = time.time()
Lv=np.array([])
def L(t):
    return np.cos(2*np.pi*t)
for g in num:
    Lval, x = quad(L, 0, g)
    Lv = np.append(Lv, [Lval])
t1 = time.time()
print(t1-t0)
# My Method
t2 = time.time()
functionValues = L(num)
Lv_Version2 = cumtrapz(functionValues, num, initial=0)
t3 = time.time()
print(t3-t2)
Which consistently yields:
t1-t0 = O(0.1) seconds
t3-t2 = 0 seconds
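Since the cumulative trapezoidal rule is an approximation, it may be worth checking how far it is from the exact answer. The sketch below compares it with the closed-form antiderivative sin(2*pi*t)/(2*pi) of the integrand used above. The import fallback and the names approx, exact and max_error are my own additions, not part of the answer.

```python
import numpy as np

try:
    # Name used in newer SciPy releases
    from scipy.integrate import cumulative_trapezoid
except ImportError:
    # Older SciPy releases only provide cumtrapz
    from scipy.integrate import cumtrapz as cumulative_trapezoid


def L(t):
    return np.cos(2 * np.pi * t)


num = np.linspace(0, 10, num=1000)

# Running integral from 0 to each entry of num (trapezoidal rule)
approx = cumulative_trapezoid(L(num), num, initial=0)

# Exact integral of cos(2*pi*t) from 0 to t
exact = np.sin(2 * np.pi * num) / (2 * np.pi)

max_error = np.max(np.abs(approx - exact))
print(max_error)
```

If the error is too large for your purposes, a finer grid in num should reduce it, at the cost of more function evaluations.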

Related

How to make a graph between order of the matrix and the time taken to multiply the two matrices?

import numpy as np
from time import time
import matplotlib.pyplot as plt
np.random.seed(27)
mysetup = "from math import sqrt"
begin=time()
i=int(input("Number of rows in first matrix"))
k=int(input("Number of column in first and rows in second matrix"))
j=int(input("Number of columns in second matrix"))
A = np.random.randint(1,10,size = (i,k))
B = np.random.randint(1,10,size = (k,j))
def multiply_matrix(A,B):
    global C
    if A.shape[1]==B.shape[0]:
        C=np.zeros((A.shape[0],B.shape[1]),dtype=int)
        for row in range(i):
            for col in range(j):
                for elt in range(0,len(B)):
                    C[row,col] += A[row,elt]*B[elt,col]
        return C
    else:
        return "Cannot multiply A and B"
print(f"Matrix A:\n {A}\n")
print(f"Matrix B:\n {B}\n")
D=print(multiply_matrix(A, B))
end=time()
t=print(end-begin)
x=[0,100,10]
y=[100,100,1000]
plt.plot(x,y)
plt.xlabel('Time taken for the program to run')
plt.ylabel('Order of the matrix multiplication')
plt.show()
In the program, I have generated random elements for the matrices to be multiplied. Basically, I am trying to measure the time it takes to multiply two matrices. The values i, j and k are taken as the order of the matrices. As we cannot multiply matrices where the number of columns of the first is not equal to the number of rows of the second, I have used the same variable 'k' for both.
Initially I considered incrementing the order of the matrix using a for loop but wasn't able to do so. I want the graph to display the time it took to multiply the matrices on the x axis and the order of the resultant matrix on the y axis.
There is a problem in the logic I applied, but I am not able to work out how to solve it, as I am a beginner in programming.
I was expecting to get a result with the y axis having a scale ranging from 0 to 100 in steps of 10 and the x axis with a scale of 100 to 1000 in steps of 100.
The thousandth entry on the x axis will correspond to the time it took to compute the multiplication of two matrices with 1000 rows and columns.
Suppose the time it took to compute this was 200 seconds. Then the graph should show the point (1000, 200).
Some problematic points I'd like to address:
1. You're starting the timer before the user enters the input, and the time the user takes can differ from run to run. We want to be as precise as possible, so we need to measure only how long the multiply_matrix function takes to run.
2. Because you're taking an input, each run gives you one result, and one result is only a single point, not a full graph. So we need to get rid of the user input and generate our own data.
3. Further to point 2, we are not interested in giving "one shot" for each matrix order. When we want to test how long it takes to multiply two matrices of order 300 (for example), we need to do it N times and take the average in order to be more precise. On top of that, we are generating random numbers, and it is possible that some randomly generated matrices will be quicker to multiply than others. Taking the average over N tests is not 100% accurate, but it does help.
4. You don't need to set C as a global variable, as it can be a local variable of the function multiply_matrix that we return anyway. Also, this is not how globals are meant to be used: even with global C, C is undefined at module level until the function has been called.
5. This is not a must, but it can improve your program a little: use time.perf_counter(), as it uses the clock with the highest available resolution to measure a short duration (time.perf_counter_ns() additionally avoids the precision loss of the float type).
6. You need to swap the axes, because we want to see how the time is affected by the order of the matrices, not the opposite! (So our X axis is now the order and the Y axis is the average time it took to multiply them.)
Those fixes translate to this code:
1. Calculating how much time multiply_matrix alone takes (time.perf_counter() requires import time rather than from time import time):
begin = time.perf_counter()
C = multiply_matrix(A, B)
end = time.perf_counter()
2+3. Generating our own data, looping from order 1 to order maximum_order, taking 50 tests for each order:
maximum_order = 50
tests_number_for_each_order = 50
def generate_matrices_to_graph():
    matrix_orders = []  # our X
    multiply_average_time = []  # our Y
    for order in range(1, maximum_order):
        print(order)
        times_for_each_order = []
        for _ in range(tests_number_for_each_order):
            # generating random square matrices of size order.
            A = np.random.randint(1, 10, size=(order, order))
            B = np.random.randint(1, 10, size=(order, order))
            # getting the time it took to compute
            begin = time.perf_counter()
            multiply_matrix(A, B)
            end = time.perf_counter()
            # adding it to the times list
            times_for_each_order.append(end - begin)
        # adding the data about the order and the average time it took to compute
        matrix_orders.append(order)
        multiply_average_time.append(sum(times_for_each_order) / tests_number_for_each_order)  # average
    return matrix_orders, multiply_average_time
Minor changes to multiply_matrix as we don't need i, j, k from the user:
def multiply_matrix(A, B):
    matrix_order = A.shape[1]
    C = np.zeros((matrix_order, matrix_order), dtype=int)
    for row in range(matrix_order):
        for col in range(matrix_order):
            for elt in range(0, len(B)):
                C[row, col] += A[row, elt] * B[elt, col]
    return C
And finally, call generate_matrices_to_graph:
# calling generate_matrices_to_graph and plotting the result
plt.plot(*generate_matrices_to_graph())
plt.xlabel('Matrix order')
plt.ylabel('Time [in seconds]')
plt.show()
Some outputs:
We can see that when tests_number_for_each_order is small, the graph is noisier and less precise.
[Figure: order 1-40 with 1 test for each order]
[Figure: order 1-40 with 30 tests for each order]
[Figure: order 1-40 with 80 tests for each order]
I love this kind of question:
import numpy as np
from time import time
import matplotlib.pyplot as plt
np.random.seed(27)
dim = []
times = []
for i in range(1,10001,10):
    A = np.random.randint(1,10,size=(1,i))
    B = np.random.randint(1,10,size=(i,1))
    begin = time()
    C = A @ B  # matrix product; A*B would broadcast element-wise instead
    times.append(time()-begin)
    dim.append(i)
plt.plot(times,dim)
This is a simplified test in which I used one-dimensional matrices: (1,1)(1,1), (1,11)(11,1), (1,21)(21,1) and so on.
But you can add a second loop to also change the "outer" dimension of the matrices and see how this affects the computation time.

How do i solve this equation for m using SciPy root function and how do i write it

```
import numpy as np
K = 1.38e-23
z = 8
J = 1.8e-21
T = 1
m = 0
```
These are just constants.
m = np.tanh((z*J*m)/(K*T))
This is the equation.
But I need to find the m value for each T value, so I'm not sure whether a nested loop would be better, where I loop through T and m. I've tried that and it doesn't work properly. I'm just not sure what to do, so any help would be great. Also, T is in the range 1 - 1501.
I am still not sure what you really need.
If you are doing an optimization to find a min or max, you should check again whether you have given us less information than needed. For example, you say m=0, but that gives the answer 0 for any T. Perhaps you want a range of values for both m and T; that is then another story.
Without SciPy, if you just want to calculate the result of a function of two parameters, m and T, you can use linspace, for example, to set the ranges and steps.
import numpy as np
import matplotlib.pyplot as plt
def func(T, m):
    z = 8
    K = 1.38e-23
    J = 1.8e-21
    return np.tanh((z*J*m)/(K*T))
xaxis = np.linspace(1, 1501, 1500)
yaxis = np.linspace(0, 10, 10)
result = func(xaxis[:,None], yaxis[None,:])
plt.plot(result)
And you get something like the plot below, a few curves. Note that I set m over a range; you might need other values there. Up to you. Check and go.
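If the goal really is to solve m = tanh(z*J*m/(K*T)) for m at each T, one possible approach is to rewrite it as a residual that is zero at the solution and hand that to scipy.optimize.root. This is only a sketch under my own assumptions: the helper names residual and solve_m, the starting guess m0=1.0 and the sample temperatures are not from the question. Note that m = 0 always satisfies the equation, so a nonzero starting guess is needed to find the other solution where one exists.

```python
import numpy as np
from scipy.optimize import root

K = 1.38e-23
z = 8
J = 1.8e-21


def residual(m, T):
    # Zero exactly when m satisfies m = tanh(z*J*m / (K*T))
    return m - np.tanh((z * J * m) / (K * T))


def solve_m(T, m0=1.0):
    # m0 is an assumed starting guess; starting at 0 would only
    # return the trivial solution m = 0
    sol = root(residual, x0=[m0], args=(T,))
    return sol.x[0]


# A few sample temperatures from the range given in the question
temperatures = np.array([1.0, 250.0, 500.0, 750.0, 1000.0, 1250.0, 1500.0])
m_values = np.array([solve_m(T) for T in temperatures])

for T, m in zip(temperatures, m_values):
    print(T, m)
```

To cover every T from 1 to 1501, the same list comprehension can run over np.arange(1, 1502). Close to T = z*J/K (roughly 1043 with these constants) the nonzero solution approaches zero, so the solver may converge more slowly there, and it is worth checking the residual of each result.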

Build a coupled map lattice using 2D array

So I'm trying to build a coupled map lattice on my computer.
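A coupled map lattice (CML) is given by this equation:
$$x_{n+1}(i) = (1-\varepsilon)\, f(x_n(i)) + \frac{\varepsilon}{2}\left[ f(x_n(i-1)) + f(x_n(i+1)) \right]$$
where the function f(x_n) is a logistic map:
$$f(x_n) = r\, x_n (1 - x_n)$$
with x values from 0 to 1, and r=4 for this CML.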
Note: 'n' can be thought of as time, and 'i' as space
I have spent a lot of time understanding the iterations and I came up with the code below; however, I'm not sure if this is the correct code to iterate this equation.
Note: I have used 2D numpy arrays, where rows are 'n' and columns are 'i', as is obvious from the code.
So basically, I want to develop code to simulate this equation, and here is my take on that.
Don't jump to the code directly, you won't understand what's happening without bothering to look at the equations first.
import numpy as np
import matplotlib.pyplot as plt
'''The 5 definitions created below are actually similar and only vary in their indexing. They
have been created only because of the if conditions I have put in the for loop '''
def logInit(r,x):
    y[n,0]=r*x[n,0]*(1-x[n,0])
    return y[n,0]
def logPresent(r,x):
    y[n,i]=r*x[n,i]*(1-x[n,i])
    return y[n,i]
def logLast(r,x):
    y[n,L-1]=r*x[n,L-1]*(1-x[n,L-1])
    return y[n,L-1]
def logNext(r,x):
    y[n,i+1]=r*x[n,i+1]*(1-x[n,i+1])
    return y[n,i+1]
def logPrev(r,x):
    y[n,i-1]=r*x[n,i-1]*(1-x[n,i-1])
    return y[n,i-1]
# 2d array with 4 rows, 3 cols. I created this because I want to store the evaluated values of the log
# function in this y[n,i] array
y=np.ones(12).reshape(4,3)
# creating an array of random numbers between 0-1 with 4 rows 3 columns
np.random.seed(0)
x=np.random.random((4,3))
L=3
r=4
eps=0.5
for n in range(3):
    for i in range(L):
        if i==0:
            x[n+1,i]=(1-eps)*logPresent(r,x) + 0.5*eps*(logLast(r,x)+logNext(r,x))
        elif i==L-1:
            x[n+1,i]=(1-eps)*logPresent(r,x) + 0.5*eps*(logPrev(r,x) + logInit(r,x))
        elif i > 0 and i < L - 1:
            x[n+1,i]=(1-eps)*logPresent(r,x) + 0.5*eps*(logPrev(r,x) +logNext(r,x))
            print(x)
This does give an output. Here it is:
[[0.5488135 0.71518937 0.60276338]
[0.94538775 0.82547604 0.64589411]
[0.43758721 0.891773 0.96366276]
[0.38344152 0.79172504 0.52889492]]
[[0.5488135 0.71518937 0.60276338]
[0.94538775 0.82547604 0.92306303]
[0.2449672 0.49731638 0.96366276]
[0.38344152 0.79172504 0.52889492]]
[[0.5488135 0.71518937 0.60276338]
[0.94538775 0.82547604 0.92306303]
[0.2449672 0.49731638 0.29789622]
[0.75613708 0.93368134 0.52889492]]
But I'm fairly sure this is not what I'm looking for.
Could you please figure out a correct way to iterate and loop the CML equation in code? Please suggest the changes I have to make. Thank you very much!
You'll have to think about the iterations and looping needed to simulate this equation. It might be tedious, but that's the only way you can suggest changes to my code.
Your calculations seem fine to me. You could improve the speed by using vectorization along the space dimension and by reusing your intermediate results y. I restructured your program a little, but in essence it does the same thing as before. To me the results look plausible. The image shows the random initial vector in the first row, and as time goes on (top to bottom) the coupling comes into play and little islands and patterns form.
import numpy as np
import matplotlib.pyplot as plt
L = 128 # grid size
N = 128 # time steps
r = 4
eps = 0.5
# Create random values for the initial time step
np.random.seed(0)
x = np.zeros((N+1, L))
x[0, :] = np.random.random(L)
# Create a helper matrix to save and reuse part of the calculations
y = np.zeros((N, L))
# Indices for previous, present, next position for every point on the grid
idx_present = np.arange(L) # 0, 1, ..., L-2, L-1
idx_next = (idx_present + 1) % L # 1, 2, ..., L-1, 0
idx_prev = (idx_present - 1) % L # L-1, 0, ..., L-3, L-2
def log_vector(rr, xx):
    return rr * xx * (1 - xx)
# Loop over the time steps
for n in range(N):
    # Compute y once for the whole time step and reuse it
    # to build the next time step with coupling the neighbours
    y[n, :] = log_vector(rr=r, xx=x[n, :])
    x[n+1, :] = (1-eps)*y[n,idx_present] + 0.5*eps*(y[n,idx_prev]+y[n,idx_next])
# Plot the results
plt.imshow(x)
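As a possible variation on the index arrays used above, the periodic neighbours can also be obtained with np.roll. The sketch below performs a single time step on one row; the function name cml_step and the small grid of 8 sites are my own choices for illustration.

```python
import numpy as np


def cml_step(x_row, r=4.0, eps=0.5):
    # Logistic map applied to every site of the current time step
    y = r * x_row * (1 - x_row)
    # np.roll(y, 1)[i] is y[i-1] and np.roll(y, -1)[i] is y[i+1],
    # both wrapping around at the ends (periodic boundaries)
    return (1 - eps) * y + 0.5 * eps * (np.roll(y, 1) + np.roll(y, -1))


rng = np.random.default_rng(0)
x0 = rng.random(8)   # random initial values in [0, 1)
x1 = cml_step(x0)    # values after one time step
print(x1)
```

Calling cml_step repeatedly and stacking the rows should give the same kind of space-time picture as the loop above.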

speed up finite difference model

I have a complex finite difference model which is written in Python using the same general structure as the example code below. It has two for loops: one over the iterations, and then within each iteration a loop over each position along the x array. Currently the code takes too long to run (probably due to the for loops). Is there a simple technique to use numpy to remove the second for loop?
Below is a simple example of the general structure I have used.
import numpy as np
def f(x,dt, i):
    xn = (x[i-1]-x[i+1])/dt # a simple finite difference function
    return xn
x = np.linspace(1,10,10) #create initial conditions with x[0] and x[-1] boundaries
dt = 10 #time step
iterations = 100 # number of iterations
for j in range(iterations):
    for i in range(1,9): #length of x minus the boundaries
        x[i] = f(x, dt, i) #return new value for x[i]
Does anyone have any ideas or comments on how I could make this more efficient?
Thanks,
Robin
For starters, this little change to the structure improves efficiency by roughly 15%. I would not be surprised if this code can be further optimized but that will most likely be algorithmic inside the function, i.e. some way to simplify the array element operation. Using a generator may likely help, too.
import numpy as np
import time
time0 = time.time()
def fd(x, dt, n): # x is an array, n is the order of central diff
    for i in range(len(x)-(n+1)):
        x[i+1] = (x[i]-x[i+2])/dt # a simple finite difference function
    return x
x = np.linspace(1, 10, 10) # create initial conditions with x[0] and x[-1] boundaries
dt = 10 # time step
iterations = 1000000 # number of iterations
for __ in range(iterations):
    x = fd(x, dt, 1)
print(x)
print('time elapsed: ', time.time() - time0)
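To remove the inner loop entirely, as the question asks, the update can be written with array slices. The sketch below is one way to do that, with an important caveat: it computes every interior point from the values of the previous iteration, whereas the loops above update x in place, so x[i] already sees the new x[i-1]. The two schemes therefore give different numbers, and which one is right depends on the model. The name step_vectorized is my own.

```python
import numpy as np


def step_vectorized(x, dt):
    # Work on a copy so every interior point uses the old neighbour values
    new_x = x.copy()
    # Interior points: (left neighbour - right neighbour) / dt
    new_x[1:-1] = (x[:-2] - x[2:]) / dt
    # new_x[0] and new_x[-1] keep the boundary values
    return new_x


x = np.linspace(1, 10, 10)  # initial conditions with x[0] and x[-1] as boundaries
dt = 10                     # time step
iterations = 100            # number of iterations

for _ in range(iterations):
    x = step_vectorized(x, dt)

print(x)
```

The benefit of the slice version grows with the length of x, since the per-element Python overhead is replaced by a single array operation.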

Code running since infinite time

Below is my code. It has been running for a very long time (almost a day). I am unable to figure out whether that is because there are many loops or because there is some unending loop. Here is my code:
# imports assumed by the snippet below
import math
from math import sin, cos
import numpy as np
# mat is the input data array; it is defined elsewhere and not shown here
mat1 = np.zeros((1024,1024,360),dtype=np.int32)
k = 498
gamma = 0.00774267
R = 0.37
g = np.zeros(1024)
g[0:512] = np.linspace(0,1,512)
g[513:] = np.linspace(1,0,511)
pf = np.zeros((1024,1024,360))
pf1 = np.zeros((1024,1024,360))
for b in range(0,1023) :
    for beta in range(0,359) :
        for a in range(0,1023) :
            pf[a,b,beta] = (R/(((R**2)+(a**2)+(b**2))**0.5))*mat[a,b,beta]
        pf1[:,b,beta] = np.convolve(pf[:,b,beta],g,'same')
for x in range(0,1023) :
    for y in range(0,1023) :
        for z in range(0,359) :
            for beta in range(0,359) :
                a = R*((-x*0.005)*(sin(beta)) + (y*0.005)*(cos(beta)))/(R+(x*0.005)*(cos(beta))+(y*0.005)*(sin(beta)))
                b = z*R/(R+(x*0.005)*(cos(beta))+(y*0.005)*(sin(beta)))
                U = R+(x*0.005)*(cos(beta))+(y*0.005)*(sin(beta))
                l = math.trunc(a)
                m = math.trunc(b)
                if (0<=l<1024 and 0<=m<1024) :
                    mat1[x,y,z] = mat[x,y,z] + (R**2/U**2)**pf1[l,m,beta]
import matplotlib.pyplot as plt
from skimage.transform import iradon
import matplotlib.cm as cm
from PIL import Image
I8 = (((mat1 - mat1.min()) / (mat1.max() - mat1.min())) * 255.9).astype(np.uint8)
img = Image.fromarray(I8)
img.save("M4.png")
im = Image.open("M4.png")
im.show()
Your code will run in finite time.
However, if you sprinkle in a few print statements to see where you are in the various loops, you can see why it will take so long. For instance, after the for y in range(0, 1023): line, add a print(y) line, you'll see it takes about 1 second between each printout, so that part of your code will take about 1023 x 1023 seconds, which is 12 days. You may want to look into modules like multiprocessing to parallelize some of the calculations, but even on a 32 core machine your code will still take around half a day to run.
There are several small optimizations you can do, I'm not sure entirely how much they will help. For one, you can calculate sin(beta) and cos(beta) once each in the inner loop, rather than 4 times each. You can calculate R**2 once globally, rather than every time inside the inner loop. You can calculate x*0.005 and y*0.005 less often, as well as a and l. You can split up the conditional involving l and m, and move the l conditional up above the z loop, thereby potentially avoiding that z loop sometimes.
Also, it seems weird that you're having beta range from 0 to 359, and then calculating its sin and cos values. Those functions expect arguments in radians, e.g. the sine of a right angle is not sin(90) but rather sin(math.pi/2).
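Two of the suggestions above, computing the sine and cosine only once and converting the angles to radians, can be sketched as follows. This is an illustration under my own assumptions rather than a drop-in replacement: it assumes beta is meant to be an angle in whole degrees from 0 to 359, and the names STEP, betas, sin_b, cos_b and project are hypothetical.

```python
import numpy as np

R = 0.37
STEP = 0.005  # the factor multiplying x and y in the question

# Assumption: beta is an angle in whole degrees, converted to radians here.
# The sine and cosine are computed once, outside all loops.
betas = np.deg2rad(np.arange(360))
sin_b = np.sin(betas)
cos_b = np.cos(betas)


def project(x, y, z):
    """Return arrays a, b, U holding the values for every beta at once."""
    xs, ys = x * STEP, y * STEP
    U = R + xs * cos_b + ys * sin_b
    a = R * (-xs * sin_b + ys * cos_b) / U
    b = z * R / U
    return a, b, U


a, b, U = project(0, 0, 5)
print(a.shape, b[0], U[0])
```

Because a and U do not depend on z, they could also be computed once per (x, y) pair and reused for every z, which is the same idea as moving work out of the z loop.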
