Inverting tridiagonal matrix - python

I have an equation
Ax = By
where A and B are tridiagonal matrices. I want to calculate the matrix
C = inv(A).B
There are different x's which will give different y's, so computing C once is handy.
Can someone please tell me a fast method to compute this? I am using Python 3.5 and would prefer a method from numpy; if that is not possible, scipy or cython are my second and third choices.
I have seen other similar questions but they do not fully match my problem.
Thank you

There are many methods to do it; one of the simplest is the tridiagonal matrix algorithm (Thomas algorithm), see the Wiki page. This algorithm works in O(n) time, and there is a simple NumPy implementation at the following GitHub link.
However, you may also want to implement one of the known algorithms yourself, for example something like an LU factorization.
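For reference, here is a minimal sketch of the Thomas algorithm in plain numpy. The names a (sub-diagonal), b (main diagonal), c (super-diagonal) and d (right-hand side) are illustrative, and the sketch assumes no pivoting is needed (e.g. A is diagonally dominant):
import numpy as np

def thomas(a, b, c, d):
    # Solve a tridiagonal system in O(n): forward sweep, then back substitution.
    n = len(d)
    cp = np.empty(n - 1)   # modified super-diagonal
    dp = np.empty(n)       # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n - 1):
        denom = b[i] - a[i - 1] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / denom
    dp[n - 1] = (d[n - 1] - a[n - 2] * dp[n - 2]) / (b[n - 1] - a[n - 2] * cp[n - 2])
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x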

scipy.linalg.solve_banded is a wrapper for LAPACK, which should in turn call MKL. It appears to run in O(N) time. Here is a trivial example to show the syntax:
import numpy as np
from scipy.linalg import solve_banded

a = np.array([[1,2,0,0], [-1,2,1,0], [0,1,3,1], [0,0,1,2]])
x = np.array([1,2,3,4])
b = np.dot(a,x)

# pack the three diagonals of a into banded storage for solve_banded
ab = np.empty((3,4))
ab[0,1:] = np.diag(a,1)     # super-diagonal
ab[1,:] = np.diag(a,0)      # main diagonal
ab[2,:-1] = np.diag(a,-1)   # sub-diagonal

y = solve_banded((1,1), ab, b)
print(y)   # recovers x
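Since the question actually asks for C = inv(A).B, note that solve_banded also accepts a matrix of right-hand sides, so (reusing ab from above, and assuming B is given as a dense array) something like the following should give all columns of C in one call without ever forming inv(A):
B = np.array([[2.,1,0,0], [1,2,1,0], [0,1,2,1], [0,0,1,2]])   # example tridiagonal B
C = solve_banded((1,1), ab, B)   # solves A @ C = B column by column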

Related

Matrix Multiplication in Python 3

I am trying to create several functions for linear algebra and got completely stuck on matrix multiplication. Below is my working solution, but is there possibly a cleaner solution?
def matrix_multiplication(a, b):
# Transpose matrix b
bT = list(zip(*b))
# Multiply the two
return [[sum([a[ai][j] * bT[bTi][j] for j in range(len(a[ai]))]) for bTi in range(len(bT))] for ai in range(len(a))]
First of all, I would only recommend creating a linear algebra library from scratch if you want to use it for learning purposes. Otherwise you should use numpy.linalg or something similar.
Assuming you want to do this from scratch, I recommend an object-oriented approach. This would mean creating your own matrix class.
You can try something similar to this blog: https://towardsdatascience.com/how-to-build-a-matrix-module-from-scratch-a4f35ec28b56
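If it helps, here is a minimal sketch of that object-oriented idea, assuming you only need multiplication; a real class would also validate shapes and implement more operators:
class Matrix:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # transpose the other matrix so its columns are easy to iterate
        cols = list(zip(*other.rows))
        return Matrix([[sum(x * y for x, y in zip(row, col)) for col in cols]
                       for row in self.rows])

    def __repr__(self):
        return "Matrix(%r)" % self.rows

a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])
print(a @ b)   # Matrix([[19, 22], [43, 50]])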
If you are trying to come up with your own way of computing matrix products, your loop may well work, but the more standard, "cleaner" way is to use the numpy module:
import numpy as np

A = np.array([[1,2,3],[2,3,4],[3,4,5]])
b = np.array([[1,-1,1]])
# transpose b so it becomes a column vector
b = np.transpose(b)
c = np.matmul(A, b)   # or equivalently: c = A @ b

Partial derivatives of a function found using interp2d in python/sagemath

I have a function of two variables, R(t,r), that has been constructed from a list of values for R, t, and r. This function cannot be written down explicitly; the values are found by solving a differential equation (dR(t,r)/dt). I need to take derivatives of the function, in particular dR(t,r)/dr and d^2R(t,r)/drdt (note that all derivatives should be partials). I have tried using this answer to do this, but I cannot seem to get an answer that makes sense. Any help would be appreciated.
Edit:
Below is my current code. I understand that getting anything to work without the 'Rdata' file is impossible, but the file itself is 160x1001; really, any data could be made up to get the rest to work. Z_t does not return answers that look like the derivative of my original function based on what I know, so I know it is not differentiating the function as I'd expect.
If there are numerical routines that work on the array of data directly I do not mind; I simply need some way of obtaining the derivatives.
import numpy as np
from scipy import interpolate
data = np.loadtxt('Rdata.txt')
rvals = np.linspace(1,160,160)
tvals = np.linspace(0,1000,1001)
f = interpolate.interp2d(tvals, rvals, data)
Z_t = interpolate.bisplev(tvals, rvals, f.tck, dx=0.8, dy=0)
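One hedged guess at what may be going wrong: bisplev's dx and dy are integer derivative orders, and the default interp2d spline is only linear, so a sketch along these lines (with made-up data standing in for Rdata.txt) may behave more sensibly; in newer SciPy, RectBivariateSpline's partial_derivative method does much the same thing more directly.
import numpy as np
from scipy import interpolate

rvals = np.linspace(1, 160, 160)
tvals = np.linspace(0, 1000, 1001)
data = np.random.rand(160, 1001)   # stand-in for the Rdata.txt array

# kind='cubic' so that first and mixed second derivatives are meaningful
f = interpolate.interp2d(tvals, rvals, data, kind='cubic')
# bisplev's first argument matches interp2d's x (here t), the second its y (here r);
# dx and dy are integer derivative orders along those axes
dR_dr = interpolate.bisplev(tvals, rvals, f.tck, dx=0, dy=1)      # dR/dr
d2R_drdt = interpolate.bisplev(tvals, rvals, f.tck, dx=1, dy=1)   # d^2R/(dr dt)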

Quicker calculation of double integral in python (like MatLab's integral2)

I need to perform a 2D integration (one dimension has an infinite bound). In MatLab, I have done it with integral2:
int_x = integral2(fun, 0, inf, 0, a, 'abstol', 0, 'reltol', 1e-6);
In Python, I've tried scipy's dblquad:
int_x = scipy.integrate.dblquad(fun, 0, numpy.inf, lambda x: 0, lambda x: a, epsabs=0, epsrel=1e-6)
and have also tried using nested single quads. Unfortunately, both of the scipy options take ~80x longer than MatLab's.
My question is: is there a different implementation of 2D integrals within Python that might be faster (I've tried "quadpy" without much benefit)? Alternatively, could I compile MatLab's integral2 function and call it from python without needing the MatLab runtime (and is that even kosher)?
Thanks in advance!
Brad
Update:
Turns out that I don't have the "reputation" to post an image of the equation, so please bear with the formatting: fun(N,t) = P(N) N^2 S(N,t), where P(N) is a lognormal probability distribution and S(N,t) is fairly convoluted but is an exponential in its simplest form and a hypergeometric function (truncated series) in its most complex form. N is integrated from 0 to infinity and t from 0 to pi.
First, profile. If the profile tells you that the time is spent in evaluations of fun, then your best bet is to either numba.jit it or rewrite it in Cython.
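For illustration, here is a hedged sketch of that suggestion; the real P(N) and S(N,t) are not shown in the question, so the integrand below is just a placeholder with the same structure:
import numpy as np
from numba import njit
from scipy import integrate

@njit
def fun(t, N):
    # placeholder integrand: P(N) * N**2 * S(N, t) with toy P and S
    return np.exp(-N) * N ** 2 * np.exp(-N * t)

a = np.pi
# dblquad calls fun(t, N); N runs from 0 to inf, t from 0 to a
int_x, err = integrate.dblquad(fun, 0, np.inf, lambda N: 0.0, lambda N: a,
                               epsabs=0, epsrel=1e-6)
print(int_x, err)
Each call still crosses the Python/C boundary, so the gain mainly appears once the body of fun is expensive, which should be the case for the truncated hypergeometric series.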
I created quadpy once because the scipy quadrature functions were too slow for me. If you can bring your integrand into one of the respective forms (e.g., the 2D plane with weight function exp(-x) or exp(-x^2)), you should take a look.

What is the standard way to create a matrix of Sympy (symbolic) variables?

I was trying to find the best or standard way to create a matrix (or even a tensor, if you want to go crazy, though I don't require that) of Sympy variables.
I'll describe the only way I've thought of doing it. I found the method symarray (here):
A = symarray('a', (3,4))
type(A)
<class 'numpy.ndarray'>
A
array([[a_0_0, a_0_1, a_0_2, a_0_3],
[a_1_0, a_1_1, a_1_2, a_1_3],
[a_2_0, a_2_1, a_2_2, a_2_3]], dtype=object)
and I also noticed that one can wrap it with the Matrix sympy function:
B = Matrix( symarray('b', (3,4)) )
type(B)
<class 'sympy.matrices.dense.MutableDenseMatrix'>
B
Matrix([
[b_0_0, b_0_1, b_0_2, b_0_3],
[b_1_0, b_1_1, b_1_2, b_1_3],
[b_2_0, b_2_1, b_2_2, b_2_3]])
Is one of these two the standard way of doing it? Which is the best, or the way people usually create matrices of sympy variables?
Your first method is a numpy object, the second a sympy object.
The difference will be clear when you do (matrix-) multiplication.
First try
sympy.pprint(A*A)
This will yield a 3x4 matrix with every element squared (element-wise multiplication).
Then try
sympy.pprint(B*B)
This will not work, because for matrix multiplication you need to have adequate dimensions. So try setting up B as a 4x4 matrix and you will get a result (matrix multiplication).
So which one to use depends on your use case. If you want to do real symbolic math, then I recommend sticking with the second method, keeping everything as sympy as possible. If you are more after number crunching (the typical use case for numpy), perhaps enhanced with some symbols, then use the first method.
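For example, a minimal sketch of the dimension point above: with a square sympy Matrix, * is true matrix multiplication.
import sympy
from sympy import Matrix, symarray

B4 = Matrix(symarray('b', (4, 4)))
sympy.pprint(B4 * B4)   # entry (0,0) is b_0_0**2 + b_0_1*b_1_0 + b_0_2*b_2_0 + b_0_3*b_3_0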
EDIT
Looking at the (recent) documentation, I think the most sympy-like way to create a matrix would be
C = sympy.MatrixSymbol('C', 4,4)
sympy.pprint(C)
sympy.pprint(C.as_explicit())
type(C)
You will notice that a simple print or sympy.pprint will not output all elements of the matrix, but rather just the matrix symbol. You will also notice that this method does not rely on the numpy package.

How can I improve python code performance using numpy

I have read this blog which shows how an algorithm had a 250x speed-up by using numpy. I have tried to improve the following code by using numpy but I couldn't make it work:
for i in nodes[1:]:
    for lb in range(2, diameter+1):
        not_valid_colors = set()
        valid_colors = set()
        for j in nodes:
            if j == i:
                break
            if distances[i-1, j-1] >= lb:
                not_valid_colors.add(c[j, lb])
            else:
                valid_colors.add(c[j, lb])
        c[i, lb] = choose_color(not_valid_colors, valid_colors)
return c
Explanation
The code above is part of an algorithm used to calculate the self-similar dimension of a graph. It works basically by constructing dual graphs G', where a node is connected to each other node if the distance between them is greater than or equal to a given value (Lb), and then computing the graph coloring of those dual networks.
The algorithm description is the following:
Assign a unique id from 1 to N to all network nodes, without assigning any colors yet.
For all Lb values, assign a color value 0 to the node with id=1, i.e. C[1][Lb] = 0.
Set the id value i = 2. Repeat the following until i = N.
a) Calculate the distance l_ij from i to all the nodes in the network with id j less than i.
b) Set Lb = 1
c) Select one of the unused colors C[j][l_ij] from all nodes j < i for which l_ij ≥ Lb. This is the color C[i][Lb] of node i for the given Lb value.
d) Increase Lb by one and repeat (c) until Lb = Lb_max.
e) Increase i by 1.
I wrote it in Python, but it takes more than a minute when I try to use it on small networks of 100 nodes with p=0.9.
As I'm still new to python and numpy, I have not found a way to improve its efficiency.
Is it possible to remove the loops by using numpy.where to find where the paths are longer than the given Lb? I tried to implement it, but it didn't work...
Vectorized operations with numpy arrays are fast since actual calculations are done with underlying libraries such as BLAS and LAPACK without Python overheads. With loop-intensive operations, you will not see those benefits.
You usually have to figure out a way to vectorize operations (usually possible with a smart use of array slicing). Some operations are inherently loop-intensive, however, and sometimes it is not easy to vectorize them (which seems to be the case for your code).
In those cases, you can first try Numba, which generates optimized machine code from a Python function without any modifications. (You just annotate the function and it will automatically do it for you). I do not have a lot of experience with it, and have not tried using this for complicated functions.
If this does not work, then you can use Cython, which converts Python-like code (with typed variables) into efficient C code automatically and generates a Python extension module that you can import and use in Python. That will usually give you at least an order of magnitude (usually two orders of magnitude) speedup for loop-intensive operations. I generally find Cython easy to use since unlike pure C, one can access your numpy arrays directly in Cython code.
I recommend using Anaconda Python distribution, since you will be able to install these packages easily. I'm sorry I don't have a specific answer for your code.
If you want to go to numpy, you can just change the lists into arrays; for example, distances[i-1][j-1] becomes distances[i-1, j-1] after you declare distances as a numpy array, and the same goes for c[i][lb]. You should think a bit more about valid_colors and not_valid_colors, because you cannot append to numpy arrays: they have a fixed length, so you would have to fix a maximum size beforehand. Another idea is that, once everything is in numpy, you can cythonize your code (http://docs.cython.org/src/tutorial/cython_tutorial.html), which makes all your loops very fast. In any case, even if you don't want cython, you can see in the blog that distances is declared as an array in main().
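As a hedged sketch of the numpy.where idea from the question (assuming nodes is sorted by id, that distances and c have already been declared as numpy arrays as suggested above, and that choose_color is defined as in the original code), the inner j-loop could be replaced with masking along these lines:
import numpy as np

for i in nodes[1:]:
    prev = np.array([j for j in nodes if j < i])   # nodes with smaller id
    d = distances[i - 1, prev - 1]                 # their distances to node i
    for lb in range(2, diameter + 1):
        far = prev[np.where(d >= lb)]              # j with l_ij >= lb
        near = prev[np.where(d < lb)]
        not_valid_colors = set(c[far, lb])
        valid_colors = set(c[near, lb])
        c[i, lb] = choose_color(not_valid_colors, valid_colors)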
