Coefficients of Charpoly using Sympy in Python

I am new to the SymPy library. I need to extract all coefficients of the characteristic polynomial for later use.
For example, my code is:
import sympy as sp
M = sp.Matrix([[0, 0, 0, 1, 0, 1], [0, 0, 0, 0, 1, 0], [0, 1, 0, 1, 0, -1], [1, 0, -1, 0, 1, 0], [0, 0, 0, 1, 0, 0], [-1, 0, 1, 0, 0, 0]])
lamda = sp.symbols('lamda')
p = M.charpoly(lamda)
print(p)
print(p.coeffs())
which gives output:
PurePoly(lamda**6 + lamda**4 - lamda**2, lamda, domain='ZZ')
[1, 1, -1]
However, I need [1, 0, 1, 0, -1, 0, 0], which includes the zero coefficients for the lamda terms with exponents 5, 3, 1, and 0. I would normally use a for loop to iterate over the polynomial to see which terms are missing, so that a zero can be inserted into the appropriate spot in the list of coefficients. However, when I attempted to do so, I received an error saying the PurePoly type doesn't support indexing. Does anyone know how to make SymPy include the zeros, or a way to do it myself? I will eventually have to incorporate this code into a loop over many matrices, so I can't do it manually.
Thanks.

When I have questions like this I hope for some sort of intelligent naming of methods for objects and look through the directory of the object:
>>> print([w for w in dir(p) if 'coeff' in w])
['all_coeffs', 'as_coeff_Add', 'as_coeff_Mul', ...]
That all_coeffs is the one you want:
>>> help(p.all_coeffs)
Help on method all_coeffs in module sympy.polys.polytools:
all_coeffs(f) method of sympy.polys.polytools.PurePoly instance
Returns all coefficients from a univariate polynomial ``f``.
>>> p.all_coeffs()
[1, 0, 1, 0, -1, 0, 0]
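Since the question mentions running this in a loop over many matrices, here is a minimal sketch of how that might look. The helper name charpoly_coeffs and the two small example matrices are made up for illustration and are not from the question:

```python
import sympy as sp

lamda = sp.symbols('lamda')


def charpoly_coeffs(matrix, symbol=lamda):
    """Return all coefficients of the characteristic polynomial,
    highest degree first, keeping zeros for the missing terms."""
    return matrix.charpoly(symbol).all_coeffs()


# Hypothetical example matrices; substitute your own.
matrices = [
    sp.Matrix([[0, 1], [1, 0]]),  # charpoly: lamda**2 - 1
    sp.Matrix([[2, 0], [0, 3]]),  # charpoly: lamda**2 - 5*lamda + 6
]

all_results = [charpoly_coeffs(m) for m in matrices]
print(all_results)  # [[1, 0, -1], [1, -5, 6]]
```

For an n x n matrix the returned list always has n + 1 entries, so the results line up across matrices of the same size.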

Related

Building an upper triangular matrix recursively

I've been breaking my head trying to come up with a recursive way to build the following matrix in Python. It is quite a challenge without pointers. Could anyone help me out?
The recursion is the following:
T0 = 1,
Tn+1 = [[Tn, Tn],
        [ 0, Tn]]
I have tried many iterations of some recursive function, but I cannot wrap my head around it.
def T(n, arr):
    n=int(n)
    if n == 0:
        return 1
    else:
        c = 2**(n-1)
        Tn = np.zeros((c,c))
        Tn[np.triu_indices(n=c)] = self.T(n=n-1, arr=arr)
        return Tn
arr = np.zeros((8,8))
T(arr=arr, n=3)
It's not hard to do this, but you need to be careful about the meaning of the zero in the recursion. This isn't really precise for larger values of n:
Tn+1 = [[Tn, Tn],
        [ 0, Tn]]
That is because the zero can represent a whole block of zeros. For example, on the second iteration you have this:
[1, 1, 1, 1],
[0, 1, 0, 1],
[0, 0, 1, 1],
[0, 0, 0, 1]
Those four zeros in the bottom-left are all represented by the one zero in the formula. The block of zeros needs to be the same shape as the blocks around it.
After that it's a matter of making NumPy put things in the right order and shape for you. numpy.block is really handy for this and makes it pretty simple:
import numpy as np
def makegasket(n):
    if n == 0:
        return np.array([1], dtype=int)
    else:
        node = makegasket(n-1)
        return np.block([[node, node], [np.zeros(node.shape, dtype=int), node]])
makegasket(3)
Result:
array([[1, 1, 1, 1, 1, 1, 1, 1],
       [0, 1, 0, 1, 0, 1, 0, 1],
       [0, 0, 1, 1, 0, 0, 1, 1],
       [0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 1]])
If you use larger n you might enjoy matplotlib.pyplot.imshow for display:
from matplotlib.pyplot import imshow
# ....
imshow(makegasket(7))
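As a side note, the block recursion Tn+1 = [[Tn, Tn], [0, Tn]] is the same as taking the Kronecker product of the 2x2 matrix [[1, 1], [0, 1]] with Tn, so numpy.kron sizes the zero block automatically. This is only a sketch of that alternative, not the method used above; the function name make_t_kron is made up for illustration:

```python
import numpy as np


def make_t_kron(n):
    """Build T_n by repeated Kronecker products (illustrative helper)."""
    base = np.array([[1, 1],
                     [0, 1]], dtype=int)
    t = np.array([[1]], dtype=int)  # T_0 as a 1x1 matrix
    for _ in range(n):
        # kron(base, t) lays out the blocks [[t, t], [0*t, t]]
        t = np.kron(base, t)
    return t


print(make_t_kron(2))
```

Note that this version returns a 2-D array of shape (1, 1) for n = 0, whereas makegasket(0) returns a 1-D array of shape (1,).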
You don't really need a recursive function to implement this recursion. The idea is to start with the upper-right (UR) corner and build outward. You could instead start with the upper-left (UL) corner to avoid some of the bookkeeping and then flip the matrix along either axis, but this won't be as efficient in the long run.
import numpy as np
def build_matrix(n):
    size = 2**n
    # Depending on the application, even dtype=bool might work
    # (the np.int and np.bool aliases were removed from recent NumPy)
    matrix = np.zeros((size, size), dtype=int)
    # This is t[0]
    matrix[0, -1] = 1
    for i in range(n):
        k = 2**i
        # Copy the current UR block to its left and below it
        matrix[:k, -2 * k:-k] = matrix[k:2 * k, -k:] = matrix[:k, -k:]
    return matrix
Just for fun, here is a plot of timing results for this implementation vs. @Mark Meyer's answer. It shows the slight timing (and memory) advantage of using a looping approach in this case:
Both algorithms run out of memory around n=15 on my machine, which is not too surprising.
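If you want to reproduce a comparison like this yourself, a rough sketch using timeit might look like the following. The function names, the choice n = 8, and the repeat count are arbitrary choices for illustration, and the timings will differ from machine to machine:

```python
import timeit

import numpy as np


def build_recursive(n):
    """Recursive np.block version, as in the answer above."""
    if n == 0:
        return np.array([1], dtype=int)
    node = build_recursive(n - 1)
    return np.block([[node, node], [np.zeros(node.shape, dtype=int), node]])


def build_iterative(n):
    """Loop version that grows the matrix from the upper-right corner."""
    size = 2**n
    matrix = np.zeros((size, size), dtype=int)
    matrix[0, -1] = 1
    for i in range(n):
        k = 2**i
        matrix[:k, -2 * k:-k] = matrix[k:2 * k, -k:] = matrix[:k, -k:]
    return matrix


n = 8  # arbitrary size for the comparison
same = np.array_equal(build_recursive(n), build_iterative(n))
t_rec = timeit.timeit(lambda: build_recursive(n), number=20)
t_it = timeit.timeit(lambda: build_iterative(n), number=20)
print(f"identical: {same}, recursive: {t_rec:.4f}s, iterative: {t_it:.4f}s")
```

Checking that both functions return the same matrix before timing them guards against comparing two different computations.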

Matrix Expression from String

Context: I'm doing a bunch of simulations that require me to implement different Hamiltonians. These Hamiltonians are just matrices, built out of Kronecker products of some common elements, with some prefactors that I have to calculate based on the system parameters. For example, using ⊗ for the Kronecker product:
H = w1(a,b,c) * sigmax ⊗ I + w2(x,y,z)*I ⊗ sigmay
I was hoping I could make a simple parser that could read in the values of a,b,c,x,y,z and an expression for the Hamiltonian and construct the necessary matrix. Sympy seems like an obvious candidate, but I can't get a matrix expression to build using strings.
from sympy import symbols, Matrix, MatrixSymbol, eye
from sympy.physics import msigma
from sympy.physics.quantum import TensorProduct
w1,w2 = symbols('w1 w2')
X1 = MatrixSymbol('X1',4,4)
X2 = MatrixSymbol('X2',4,4)
x = msigma(1)
x_1 = TensorProduct(eye(2),x)
x_2 = TensorProduct(x,eye(2))
exp = w1*X1 + w2*X2
exp.subs([(w1,0.5),(w2,2),(X1,x_1),(X2,x_2)]).as_explicit()
will work. But, trying
exp = MatrixExpr('w1*X1+w2*X2')
or
exp = MatrixExpr(sympify('w1*X1+w2*X2'))
or even
exp = sympify('w1*X1 + w2*X2')
exp.subs([(w1,0.5),(w2,2),(X1,x_1),(X2,x_2)])
won't.
It also won't work if I change w1 or w2 to be 1x1 instances of a MatrixSymbol.
What am I doing wrong here? This is my first time using SymPy, so I'm well aware that I may just be missing something.
Let's look at what's going on in a simpler case:
exp = sympify('w1*X1'); right_exp = w1*X1
type(exp), type(right_exp)
Out[47]: (sympy.core.mul.Mul, sympy.matrices.expressions.matmul.MatMul)
It looks like sympify doesn't understand that X1 is a matrix. If we state that explicitly, everything works:
exp = sympify("w1*MatrixSymbol('X1',4,4)")
exp.subs([(w1,0.5),(X1,x_1)]).as_explicit()
Out[49]:
Matrix([
[ 0, 0.5, 0, 0],
[0.5, 0, 0, 0],
[ 0, 0, 0, 0.5],
[ 0, 0, 0.5, 0]])
right_exp.subs([(w1,0.5),(X1,x_1)]).as_explicit()
Out[50]:
Matrix([
[ 0, 0.5, 0, 0],
[0.5, 0, 0, 0],
[ 0, 0, 0, 0.5],
[ 0, 0, 0.5, 0]])
And the final statement:
exp = sympify("w1*MatrixSymbol('X1',4,4)+w2*MatrixSymbol('X2',4,4)")
exp.subs([(w1,0.5),(w2,2),(X1,x_1),(X2,x_2)]).as_explicit()
Out[63]:
Matrix([
[ 0, 0.5, 2, 0],
[0.5, 0, 0, 2],
[ 2, 0, 0, 0.5],
[ 0, 2, 0.5, 0]])
What's going on? If you read "Basics of expressions in SymPy", you will find the statement that "matrices aren't sympifiable", and sympify interprets X1 as a plain symbol.
It's hard to say how it will behave in other situations. There are notes in the docs that warn:
Sometimes autosimplification during sympification results in
expressions that are very different in structure than what was
entered.
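Another way to tell the parser that X1 and X2 are matrices, without writing MatrixSymbol(...) inside the string, is the locals argument of sympify. This is only a sketch of that option, reusing the names from the question; I haven't checked how it behaves for more complicated expressions:

```python
from sympy import MatrixExpr, MatrixSymbol, eye, symbols, sympify
from sympy.physics.matrices import msigma
from sympy.physics.quantum import TensorProduct

w1, w2 = symbols('w1 w2')
X1 = MatrixSymbol('X1', 4, 4)
X2 = MatrixSymbol('X2', 4, 4)

# Names listed in `locals` are taken as-is instead of becoming plain Symbols.
expr = sympify('w1*X1 + w2*X2', locals={'X1': X1, 'X2': X2})
print(type(expr), isinstance(expr, MatrixExpr))

x = msigma(1)
x_1 = TensorProduct(eye(2), x)
x_2 = TensorProduct(x, eye(2))

result = expr.subs([(w1, 0.5), (w2, 2), (X1, x_1), (X2, x_2)]).as_explicit()
print(result)
```

This keeps the expression string free of constructor calls, which may be more convenient if the strings come from a config file.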

Python: Generating from geometric distribution

Is this the best or most efficient way to generate random numbers from a geometric distribution with an array of parameters that may contain 0?
allids["c"]=[2,0,1,1,3,0,0,2,0]
[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
Note I am somewhat concerned about optimization.
EDIT:
A bit of context: I have a sequence of characters (e.g. ATCGGGA) and I would like to expand/contract runs of a single character (e.g. if the original sequence had a run of 2 'A's, I want to simulate a sequence that will have an expected value of 2 'A's, but vary according to a geometric distribution). I do NOT want characters in runs of length 1 to vary in length.
So if
seq = 'AATCGGGAA'
allids["c"]=[2,0,1,1,3,0,0,2,0]
rep=[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
"".join([s*r for r, s in zip(rep, seq)])
will output (when rep is [1, 0, 1, 1, 3, 0, 0, 1, 0])
"ATCGGGA"
You can use a masked array to avoid the division by zero.
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.random.geometric(1. / a)
rep[a.mask] = 0
This generates a random sample for each element of a, and then deletes some of them later. If you're concerned about this waste of random numbers, you could generate just enough, like so:
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.zeros(a.shape, dtype=int)
rep[~a.mask] = np.random.geometric(1. / a[~a.mask])
What about this:
import numpy
from numpy import array, logical_not
counts = array([2, 0, 1, 1, 3, 0, 0, 2, 0], dtype=float)
counts_ma = numpy.ma.array(counts, mask=(counts == 0))
# Only the unmasked (nonzero) entries are replaced by random draws
nonzero = logical_not(counts_ma.mask)
counts[nonzero] = \
    array([numpy.random.geometric(v) for v in 1.0 / counts[nonzero]])
You could potentially precompute the distribution of homopolymer runs and limit the number of calls to geometric, as fetching large numbers of values from RNGs is more efficient than making individual calls.
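To tie the pieces together, here is a small sketch that draws all the needed samples in one vectorized call and then rebuilds the sequence from the question. It uses a plain boolean mask instead of a masked array and NumPy's Generator API; the seed is there only to make the run repeatable and is not part of the original question:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily

seq = 'AATCGGGAA'
counts = np.array([2, 0, 1, 1, 3, 0, 0, 2, 0])

rep = np.zeros(counts.shape, dtype=int)
positive = counts > 0
# One call to geometric covers every nonzero entry; zeros stay zero.
rep[positive] = rng.geometric(1.0 / counts[positive])

# tolist() converts to plain Python ints before repeating the characters.
new_seq = "".join(s * r for s, r in zip(seq, rep.tolist()))
print(rep, new_seq)
```

Entries with a count of 1 get p = 1, so they always come out as exactly 1, which matches the requirement that runs of length 1 do not vary.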

Why does SymPy give me the wrong answer when I row-reduce a symbolic matrix?

If I ask SymPy to row-reduce the singular matrix
from sympy import Symbol, Matrix
nu = Symbol('nu')
lamb = Symbol('lambda')
A3 = Matrix([[-3*nu, 1, 0, 0],
             [3*nu, -2*nu-1, 2, 0],
             [0, 2*nu, (-1 * nu) - lamb - 2, 3],
             [0, 0, nu + lamb, -3]])
print(A3.rref())
then it returns the identity matrix
(Matrix([
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]]), [0, 1, 2, 3])
which it shouldn't do, since the matrix is singular. Why is SymPy giving me the wrong answer and how can I get it to give me the right answer?
I know SymPy knows the matrix is singular, because when I ask for A3.inv(), it gives
raise ValueError("Matrix det == 0; not invertible.")
Furthermore, when I remove lamb from the matrix (equivalent to setting lamb = 0), SymPy gives the correct answer:
(Matrix([
[1, 0, 0, -1/nu**3],
[0, 1, 0, -3/nu**2],
[0, 0, 1, -3/nu],
[0, 0, 0, 0]]), [0, 1, 2])
which leads me to believe that this problem only happens with more than one variable.
EDIT: Interestingly, I just got the correct answer when I pass rref() the argument "simplify=True". I still have no idea why that is though.
The rref algorithm fundamentally requires the ability to tell whether the elements of the matrix are identically zero. In SymPy, the simplify=True option instructs SymPy to simplify the entries first at the relevant stage of the algorithm. With symbolic entries this is necessary, as you can easily have symbolic expressions that are identically zero but which don't simplify to zero automatically, like x*(x - 1) - x**2 + x. The option is off by default because in general such simplification can be expensive, though this can be controlled by passing in a less general function than simplify (for rational functions, use cancel). The defaults here could probably be smarter.
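Here is a short sketch of both points, using the expression above and the matrix from the question. The exact form of the reduced matrix may depend on your SymPy version, so only the pivots and the determinant are printed:

```python
import sympy as sp

x, nu, lamb = sp.symbols('x nu lambda')

# Identically zero, but not recognized as such without simplification.
expr = x*(x - 1) - x**2 + x
print(expr == 0)             # False: structural comparison only
print(sp.cancel(expr) == 0)  # True once the expression is simplified

A3 = sp.Matrix([[-3*nu, 1, 0, 0],
                [3*nu, -2*nu - 1, 2, 0],
                [0, 2*nu, -nu - lamb - 2, 3],
                [0, 0, nu + lamb, -3]])

# Ask rref to simplify entries while it searches for pivots.
reduced, pivots = A3.rref(simplify=True)
print(pivots)

# Each column of A3 sums to zero, so the determinant vanishes.
print(sp.simplify(A3.det()))
```

Three pivot columns for a 4x4 matrix is what you would expect for a singular matrix of rank 3.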

Is there a "bounding box" function (slice with non-zero values) for a ndarray in NumPy?

I am dealing with arrays created via numpy.array(), and I need to draw points on a canvas simulating an image. Since there are a lot of zero values around the central part of the array, which contains the meaningful data, I would like to "trim" the array, removing the columns and rows that contain only zeros.
So, I would like to know of some native numpy function or even a code snippet to "trim" or find a "bounding box" to slice only the data-containing part of the array.
(since it is a conceptual question, I did not put any code, sorry if I should, I'm very fresh to posting at SO.)
Thanks for reading
This should do it:
from numpy import array, argwhere
A = array([[0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0, 0, 0],
           [0, 0, 1, 1, 0, 0, 0],
           [0, 0, 0, 0, 1, 0, 0],
           [0, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 0, 0]])
B = argwhere(A)
(ystart, xstart), (ystop, xstop) = B.min(0), B.max(0) + 1
Atrim = A[ystart:ystop, xstart:xstop]
The code below, from this answer, runs fastest in my tests:
import numpy as np
def bbox2(img):
    rows = np.any(img, axis=1)
    cols = np.any(img, axis=0)
    # First and last index of the rows/columns that contain nonzero data
    ymin, ymax = np.where(rows)[0][[0, -1]]
    xmin, xmax = np.where(cols)[0][[0, -1]]
    return img[ymin:ymax+1, xmin:xmax+1]
The accepted answer using argwhere worked but ran slower. My guess is, it's because argwhere allocates a giant output array of indices. I tested on a large 2D array (a 1024 x 1024 image, with roughly a 50x100 nonzero region).
Something like:
empty_cols = np.all(array == 0, axis=0)
empty_rows = np.all(array == 0, axis=1)
The resulting arrays will be 1D boolean arrays. Loop over them from both ends to find the bounding box.
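A possible way to finish that idea without an explicit loop is to take the first and last indices of the non-empty rows and columns. The function name trim_zeros_2d is made up here, and returning an empty array for all-zero input is just one reasonable choice:

```python
import numpy as np


def trim_zeros_2d(arr):
    """Return the smallest sub-array containing all nonzero entries."""
    rows = np.flatnonzero(np.any(arr != 0, axis=1))
    cols = np.flatnonzero(np.any(arr != 0, axis=0))
    if rows.size == 0:
        # Nothing but zeros: return an empty 2-D slice (assumed behavior).
        return arr[:0, :0]
    return arr[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


A = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 0, 0, 0],
              [0, 0, 0, 0, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0]])

trimmed = trim_zeros_2d(A)
print(trimmed)
```

The result is a view into the original array, so copy it if you plan to modify it independently.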
