I have a NumPy matrix, for example numpy.matrix([[-1, 2],[1, -2]], dtype='int'). I want to get its integer-valued eigenvectors, if any exist; for example, numpy.array([[-1], [1]]) for the matrix above. What NumPy returns are eigenvectors in floating point, scaled to have unit length.
One can do this in Sage, where one can specify the field (i.e., data type) of the matrix, and operations on the matrix will respect that field.
Any idea how to do this nicely in Python? Many thanks in advance.
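To illustrate the floating-point output, here is a rough sketch using np.linalg.eig plus a naive rescale-and-round; it is not a robust method and only recovers integer vectors when the eigenvectors really are rational multiples of integer vectors and round-off is negligible:
import numpy as np

A = np.array([[-1, 2], [1, -2]])
vals, vecs = np.linalg.eig(A)                        # columns of vecs are unit-length floats
for v in vecs.T:                                     # each row of vecs.T is one eigenvector
    v = v / np.abs(v[np.abs(v) > 1e-12]).min()       # rescale by the smallest nonzero entry
    print(np.round(v).astype(int))                   # e.g. [2 1] and [ 1 -1] (up to sign)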
I am personally content with the following solution: I call Sage from Python and let Sage compute what I want. Sage, being math-oriented, is rather versatile in computations involving fields other than the reals.
Below is my script compute_intarrs.py; it requires Sage to be installed. Be aware it is a little slow.
import subprocess
import re
import numpy as np
# construct a numpy matrix
mat = np.matrix([[1,-1],[-1,1]])
# convert the matrix into a string recognizable by sage
matstr = re.sub(r'\s|[a-z]|\(|\)', '', repr(mat))
# write a (sage) python script "mat.py";
# for more info on the sage commands:
# www.sagemath.org/doc/faq/faq-usage.html#how-do-i-import-sage-into-a-python-script
# www.sagemath.org/doc/tutorial/tour_linalg.html
f = open('mat.py', 'w')
f.write('from sage.all import *\n\n')
f.write('A = matrix(ZZ, %s)\n\n' % matstr)
f.write('print A.kernel()') # this returns the left nullspace vectors
f.close()
# call sage and run mat.py
p = subprocess.Popen(['sage', '-python', 'mat.py'], stdout=subprocess.PIPE)
# process the output from sage
arrstrs = p.communicate()[0].split('\n')[2:-1]
arrs = [np.array(eval(re.sub('(?<=\d)\s*(?=\d|-)', ',', arrstr)))
for arrstr in arrstrs]
print arrs
Result:
In [1]: %run compute_intarrs.py
[array([1, 1])]
You can do some pretty cool things with dtype = object and the fractions.Fraction class, e.g.
>>> import fractions
>>> import numpy as np
>>> A = np.array([fractions.Fraction(1, j) for j in xrange(1, 13)]).reshape(3, 4)
>>> A
array([[1, 1/2, 1/3, 1/4],
       [1/5, 1/6, 1/7, 1/8],
       [1/9, 1/10, 1/11, 1/12]], dtype=object)
>>> B = np.array([fractions.Fraction(1, j) for j in xrange(1, 13)]).reshape(4, 3)
>>> B
array([[1, 1/2, 1/3],
       [1/4, 1/5, 1/6],
       [1/7, 1/8, 1/9],
       [1/10, 1/11, 1/12]], dtype=object)
>>> np.dot(A, B)
array([[503/420, 877/1320, 205/432],
       [3229/11760, 751/4620, 1217/10080],
       [1091/6930, 1871/19800, 1681/23760]], dtype=object)
Unfortunately the np.linalg module converts everything to float before doing anything, so you can't expect to get solutions directly as integers or rationals. But you can always do the following after your computations:
def scale_to_int(x):
    # represent every entry exactly as a Fraction
    fracs = [fractions.Fraction(j) for j in x.ravel()]
    # clear denominators: multiply by the lcm of all denominators
    denominators = [j.denominator for j in fracs]
    lcm = reduce(lambda a, b: max(a, b) / fractions.gcd(a, b) * min(a, b),
                 denominators)
    fracs = map(lambda x: lcm * x, fracs)
    # divide out the common factor of the resulting integers
    gcd = reduce(lambda a, b: fractions.gcd(a, b), fracs)
    fracs = map(lambda x: x / gcd, fracs)
    return np.array(fracs).reshape(x.shape)
It will be slow, and very sensitive to round-off errors:
>>> scale_to_int(np.linspace(0, 1, 5)) # [0, 0.25, 0.5, 0.75, 1]
array([0, 1, 2, 3, 4], dtype=object)
>>> scale_to_int(np.linspace(0, 1, 4)) # [0, 0.33333333, 0.66666667, 1]
array([0, 6004799503160661, 12009599006321322, 18014398509481984], dtype=object)
You could mitigate some of that using the limit_denominator method of Fraction, but it probably will not be all that robust.
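For instance, a minimal sketch of the limit_denominator idea applied to the np.linspace(0, 1, 4) case above; the 10**6 bound is an arbitrary choice:
import fractions
import numpy as np

x = np.linspace(0, 1, 4)   # [0, 0.333..., 0.666..., 1]
fracs = [fractions.Fraction(float(v)).limit_denominator(10**6) for v in x]
print(fracs)               # [Fraction(0, 1), Fraction(1, 3), Fraction(2, 3), Fraction(1, 1)]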
I came across a calculation of Euclidean distance using NumPy vectorization here. The calculation done is:
>>> tri = np.array([[1, 1],
... [3, 1],
... [2, 3]])
>>> np.sum(tri**2, axis=1) ** 0.5 # Or: np.sqrt(np.sum(np.square(tri), 1))
array([1.4142, 3.1623, 3.6056])
So, to understand, I tried:
>>> np.sum(tri**2, axis=1)
array([ 2, 10, 13])
So basically, tri**2 squares each element: [[1,1],[9,1],[4,9]]. Next, we sum the elements of each sub-array to get [1+1, 9+1, 4+9] = [2,10,13].
Then we take the square root of each of them.
But I didn't get where we are doing the subtraction qi-pi as in the formula. Also, I felt we should be getting a single value: √((1-1)^2+(9-1)^2+(4-9)^2) = 9.43
Am I missing some maths here, or some Python/NumPy understanding?
The snippet you quote computes the norm of each row of tri, i.e. the distance of each vertex from the origin, which is why no subtraction appears (p is effectively zero there). For the distance between two points, assuming you have two vectors p and q represented as np.array:
dist = np.sqrt(np.sum((q - p) ** 2))
There is also np.linalg.norm which computes the same thing:
assert np.isclose(dist, np.linalg.norm(q - p))
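For example, with two concrete points (values chosen purely for illustration):
import numpy as np

p = np.array([1, 1])
q = np.array([2, 3])
dist = np.sqrt(np.sum((q - p) ** 2))
print(dist)                   # 2.2360679... == sqrt(5)
print(np.linalg.norm(q - p))  # same value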
Hello, I want to do some summation on a NumPy array like this:
import numpy as np
import sympy as sy
import cv2
i, j = sy.symbols('i j', Integer=True)
#next read some grayscale image to create a numpy array of pixels
a = cv2.imread(filename)
b = sy.summation(sy.summation(a[i][j], (i,0,1)), (j,0,1)) #double summation
but I'm facing an error. Is it possible to use SymPy symbols as indices into a NumPy array? If not, can you suggest a solution?
Thanks.
You can't use NumPy objects directly in SymPy expressions, because NumPy objects don't know how to deal with symbolic variables.
Instead, create the thing you want symbolically using SymPy objects, and then lambdify it. The SymPy analogue of a NumPy array is IndexedBase, but it seems there is a bug with it, so since your array is 2-dimensional you can use MatrixSymbol instead.
In [49]: a = MatrixSymbol('a', 2, 2) # Replace 2, 2 with the size of the array
In [53]: i, j = symbols('i j', integer=True)
In [50]: f = lambdify(a, Sum(a[i, j], (i, 0, 1), (j, 0, 1)))
In [51]: b = numpy.array([[1, 2], [3, 4]])
In [52]: f(b)
Out[52]: 10
(also note that the correct syntax for creating integer symbols is symbols('i j', integer=True), not symbols('i j', Integer=True)).
Note that you have to use a[i, j] instead of a[i][j], which isn't supported.
MatrixSymbol is limited to 2-dimensional matrices. To generalize to arrays of any dimension, you can generate the expression with IndexedBase. lambdify is currently incompatible with IndexedBase, but it can be used with DeferredVectors. So the trick is to pass a DeferredVector to lambdify:
import sympy as sy
import numpy as np
a = sy.IndexedBase('a')
i, j, k = sy.symbols('i j k', integer=True)
s = sy.Sum(a[i, j, k], (i, 0, 1), (j, 0, 1), (k, 0, 1))
f = sy.lambdify(sy.DeferredVector('a'), s)
b = np.arange(24).reshape(2,3,4)
result = f(b)
expected = b[:2,:2,:2].sum()
assert expected == result
In mathematics, a "generating function" is defined from a sequence of numbers c0, c1, c2, ..., cn by c0+c1*x+c2*x^2 + ... + cn*x^n. These come as "moment generating functions", "probability generating functions" and various other types, depending on the source of the coefficient.
I have an array of the coefficients and I'd like a quick way to create the corresponding generating function.
I could do
import numpy as np
myArray = np.array([1,2,3,4])
x=0.2
sum([c*x**k for k, c in enumerate(myArray)])
or I could have an array with c[k] in the kth entry. It seems there should be a fast NumPy way to do this.
Unfortunately attempts to look this up are complicated by the fact that "generate" and "function" are common words in programming, as is the combination "generating function" so I haven't had any luck with search engines.
x = .2
coeffs = np.array([1,2,3,4])
Make an array of the degree of each term
degrees = np.arange(len(coeffs))
Raise x to each degree
terms = np.power(x, degrees)
Multiply the coefficients and sum
result = np.sum(coeffs*terms)
>>> coeffs
array([1, 2, 3, 4])
>>> degrees
array([0, 1, 2, 3])
>>> terms
array([ 1. , 0.2 , 0.04 , 0.008])
>>> result
1.552
As a function:
def f(coeffs, x):
    degrees = np.arange(len(coeffs))
    terms = np.power(x, degrees)
    return np.sum(coeffs*terms)
Or simply use the NumPy polynomial package:
from numpy.polynomial import Polynomial as P
p = P(coeffs)
result = p(x)
If you are looking for performance, np.einsum could be suggested too:
np.einsum('i,i->',myArray,x**np.arange(myArray.size))
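As a quick check, a small sketch using the myArray and x from the question; the einsum form and the plain loop give the same value:
import numpy as np

myArray = np.array([1, 2, 3, 4])
x = 0.2
via_einsum = np.einsum('i,i->', myArray, x**np.arange(myArray.size))
via_loop = sum(c * x**k for k, c in enumerate(myArray))
assert np.isclose(via_einsum, via_loop)   # both are 1.552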
>>> coeffs = np.random.random(5)
>>> coeffs
array([ 0.70632473, 0.75266724, 0.70575037, 0.49293719, 0.66905641])
>>> x = np.random.random()
>>> x
0.7252944971757169
>>> powers = np.arange(0, coeffs.shape[0], 1)
>>> powers
array([0, 1, 2, 3, 4])
>>> result = coeffs * x ** powers
>>> result
array([ 0.70632473, 0.54590541, 0.37126147, 0.18807659, 0.18514853])
>>> np.sum(result)
1.9967167252487628
Using NumPy's Polynomial class is probably the easiest way.
from numpy.polynomial import Polynomial
coefficients = [1, 2, 3, 4]
f = Polynomial(coefficients)
You can then use the object like any other function.
import numpy as np
import matplotlib.pyplot as plt

print(f(0.2))
x = np.linspace(-5, 5, 51)
plt.plot(x, f(x))
plt.show()
I have an array comprised of N 3x3 arrays (a collection of matrices, although the data type is np.ndarray) and I have an array comprised of N 3x1 arrays (a collection of vectors). What I want to do is multiply each matrix by each vector, so I expect to get back N 3x1 arrays.
Simple example:
A = np.ones((6,3,3))
B = np.ones((6,3,1))
np.dot(A,B) # This gives me a 6x3x6x1 array, which is not what I want
np.array(map(np.dot,A,B)) # This gives me exactly what I want, but I don't want to have to rely on map
I've tried all kinds of reshaping, explored einsum, etc., but can't get this to work the way I want it to. How do I get this to work with NumPy broadcasting? This operation will ultimately need to be done many thousands of times, and I don't want map or list comprehension operations to slow things down.
You can use np.einsum to calculate the dot products and create the matrix of the desired shape:
np.einsum('ijk,ikl->ijl', A, B)
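A quick sanity check of the einsum call against the map-style approach from the question (random data, shapes as in the question):
import numpy as np

A = np.random.rand(6, 3, 3)
B = np.random.rand(6, 3, 1)
C = np.einsum('ijk,ikl->ijl', A, B)
D = np.array([np.dot(a, b) for a, b in zip(A, B)])
assert C.shape == (6, 3, 1)
assert np.allclose(C, D)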
One could use the built-in matrix multiplication operator @, available in Python 3.5 or above and introduced in PEP 465.
$ python --version
Python 3.6.6
>>> import numpy as np
>>> A = np.ones((6,3,3))
>>> B = np.ones((6,3,1))
>>> C = A # B
>>> print(C)
[[[3.]
  [3.]
  [3.]]

 [[3.]
  [3.]
  [3.]]

 [[3.]
  [3.]
  [3.]]

 [[3.]
  [3.]
  [3.]]

 [[3.]
  [3.]
  [3.]]

 [[3.]
  [3.]
  [3.]]]
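If the @ operator is not available in your Python version, np.matmul computes the same batched product (this assumes NumPy 1.10 or later):
>>> C2 = np.matmul(A, B)
>>> np.allclose(C, C2)
True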
A = np.random.rand(6, 3, 3)
B = np.random.rand(6, 3, 1)
C = np.array(map(np.dot, A, B))
D = np.sum(A*B.swapaxes(1, 2), axis=2)[..., None]
assert np.allclose(C, D)
assert C.shape == D.shape == (6, 3, 1)
The "allclose" is because there's some floating point rounding difference between the two methods on the order of 1e-16.
The .swapaxes and the [..., None] are just to get the arrays to conform to the shapes you specified. You could also represent it more simply with:
A = np.random.rand(6, 3, 3)
B = np.random.rand(6, 3)
C = np.array(map(np.dot, A, B))
D = np.sum(A*B[:, None, :], axis=2)
assert np.allclose(C, D)
assert C.shape == D.shape == (6, 3)
How should I compute the pseudo-inverse of a matrix using SymPy (not NumPy, because the matrix has symbolic constants and I want the inverse in symbolic form as well)? The normal inv() does not work for a non-square matrix in SymPy. For example, if M = Matrix(2, 3, [1,2,3,4,5,6]), pinv(M) should give
-0.9444 0.4444
-0.1111 0.1111
0.7222 -0.2222
I think since this is all symbolic it should be OK to use the text-book formulas taught in a linear algebra class (e.g. see the list of special cases in the Wikipedia article on the Moore–Penrose pseudoinverse). For numerical evaluation pinv uses the singular value decomposition (svd) instead.
You have linearly independent rows (full row rank), so you can use the formula for a 'right' inverse:
>>> import sympy as sy
>>> M = sy.Matrix(2,3, [1,2,3,4,5,6])
>>> N = M.H * (M * M.H) ** -1
>>> N.evalf(4)
[-0.9444, 0.4444]
[-0.1111, 0.1111]
[ 0.7222, -0.2222]
>>> M * N
[1, 0]
[0, 1]
For full column rank, replace M with M.H, transpose the result, and simplify to get the following formula for the 'left' inverse:
>>> M = sy.Matrix(3, 2, [1,2,3,4,5,6])
>>> N = (M.H * M) ** -1 * M.H
>>> N.evalf(4)
[-1.333, -0.3333, 0.6667]
[ 1.083, 0.3333, -0.4167]
>>> N * M
[1, 0]
[0, 1]
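For what it's worth, recent SymPy versions also provide a pinv() method on Matrix that handles both cases symbolically; a minimal sketch, assuming your SymPy version ships it:
>>> M = sy.Matrix(2, 3, [1,2,3,4,5,6])
>>> M.pinv().evalf(4)   # requires Matrix.pinv; should match the 'right' inverse N above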