Understanding numpy vectorization - python

I came across a calculation of Euclidean distance using numpy vectorization. The calculation is:
>>> tri = np.array([[1, 1],
...                 [3, 1],
...                 [2, 3]])
>>> np.sum(tri**2, axis=1) ** 0.5 # Or: np.sqrt(np.sum(np.square(tri), 1))
array([1.4142, 3.1623, 3.6056])
So, to understand, I tried:
>>> np.sum(tri**2, axis=1)
array([ 2, 10, 13])
So basically, tri**2 squares each element: [[1,1],[9,1],[4,9]]. Next, we sum the elements of each sub-array to get [1+1, 9+1, 4+9] = [2,10,13].
Then we take the square root of each of them.
But I didn't get where we are doing the subtraction qi-pi as in the formula. Also, I felt we should be getting a single value: √((1-1)^2+(9-1)^2+(4-9)^2)=9.43
Am I missing some maths here, or some Python / numpy understanding?

Assuming you have two vectors p and q represented as np.array:
dist = np.sqrt(np.sum((q - p) ** 2))
There is also np.linalg.norm which computes the same thing:
assert np.isclose(dist, np.linalg.norm(q - p))
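To connect this back to the question: as far as I can tell, the tri snippet computes the norm of each row, which is the distance of each vertex from the origin (p = 0), so no subtraction shows up and you get one value per row. A minimal sketch, assuming you want distances from some point p of your choosing (the names from_origin, from_p and single are only for illustration):

```python
import numpy as np

tri = np.array([[1, 1],
                [3, 1],
                [2, 3]])

# Norm of each row: distance of each vertex from the origin, i.e. p = (0, 0).
from_origin = np.sqrt(np.sum(tri**2, axis=1))

# Distance of each vertex from an arbitrary point p (assumed here to be the first vertex).
p = np.array([1, 1])
from_p = np.sqrt(np.sum((tri - p)**2, axis=1))

# A single distance between two points, here the first two vertices.
single = np.linalg.norm(tri[1] - tri[0])
```

With axis=1 the sum runs over the coordinates of each row, so the result has one entry per vertex; dropping axis (as in the answer above) sums everything into a single number.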

Related

Python equivalent of (matrix)*(vector) in R

In R, when I execute the code below:
> X=matrix(1,2,3)
> c=c(1,2,3)
> X*c
R gives out the following output:
     [,1] [,2] [,3]
[1,]    1    3    2
[2,]    2    1    3
But when I do the below on Python:
>>> import numpy as np
>>> X=np.array([[1,1,1],[1,1,1]])
>>> c=np.array([1,2,3])
>>> X*c
the Python code above gives the following output:
array([[1, 2, 3],
       [1, 2, 3]])
Is there any way to make Python produce output identical to R's? I think I somehow have to tell numpy to multiply the elements of the matrix X by the elements of the vector c going down the columns instead of along the rows, but I am not sure how to go about this.
In [18]: np.reshape([1,2,3]*2,(2,3),order='F')
Out[18]:
array([[1, 3, 2],
       [2, 1, 3]])
This starts with a list multiply, which is replication:
In [19]: [1,2,3]*2
Out[19]: [1, 2, 3, 1, 2, 3]
The rest uses numpy to reshape it into a (2,3) array, but with consecutive values going down, 'F' order.
Not knowing R, and in particular the c(1,2,3) expression, I can't say that's what's going on in R.
===
You talk about rows with columns, but I don't see how that works in your example. That said, we can easily perform outer like products
===
This reproduces your R_Product (at least in a few test cases):
In [138]: def foo(X,c):
     ...:     X1 = X.ravel(order='F')
     ...:     Y = np.resize(c,X1.shape)*X1
     ...:     return Y.reshape(X.shape, order='F')
     ...:
In [139]: foo(np.ones((2,3)),np.arange(1,4))
Out[139]:
array([[1., 3., 2.],
       [2., 1., 3.]])
In [140]: foo(np.arange(6).reshape(2,3),np.arange(1,4))
Out[140]:
array([[ 0,  3,  6],
       [ 6,  4, 15]])
I'm using the resize function to replicate c to match the total number of elements of X, and order F both to flatten X column by column and to stack the result back in the desired column order. The default for numpy is order C.
In numpy, replicating an array to match another is not common, at least not in this sense. Replicating by row or by column, as in broadcasting, is common. And of course so is reshaping.
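For comparison, here is a minimal sketch of the broadcasting style mentioned above, assuming you have one weight per column or one weight per row (the names per_column, per_row, scaled_columns and scaled_rows are made up for illustration):

```python
import numpy as np

X = np.ones((2, 3))
per_column = np.array([1, 2, 3])   # one value for each of the 3 columns
per_row = np.array([10, 20])       # one value for each of the 2 rows

# Shape (3,) lines up with the last axis, so each column gets its own factor.
scaled_columns = X * per_column

# Adding a trailing axis gives shape (2, 1), so each row gets its own factor.
scaled_rows = X * per_row[:, np.newaxis]
```

Neither of these is R's recycling rule; they only show what numpy does naturally, which is why the R result needs the explicit resize and order='F' handling.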
I am the OP.
I was looking for a quick and easy solution, but I guess there is no straightforward functionality in Python that allows us to do this. So, I had to make a function that multiplies a matrix with a vector in the same manner that R does:
def R_product(X,c):
    """
    Computes the regular R product
    (not same as the matrix product) between
    a 2D Numpy Array X, and a numpy vector c.
    Args:
        X: 2D Numpy Array
        c: A Numpy vector
    Returns: the output of X*c in R.
    (This is different than X %*% c in R)
    """
    X_nrow = X.shape[0]
    X_ncol = X.shape[1]
    X_dummy = np.zeros(shape=((X_nrow * X_ncol),1))
    nrow = X_dummy.shape[0]
    nc = nrow // len(c)
    Y = np.zeros(shape=(nrow,1))
    for j in range(X_ncol):
        for u in range(X_nrow):
            X_element = X[u,j]
            if u == X_nrow - 1:
                idx = X_nrow * (j+1) - 1
            else:
                idx = X_nrow * j + (u+1) - 1
            X_dummy[idx,0] = X_element
    for i in range(nc):
        for j in range(len(c)):
            Y[(i*len(c)+j):(i*len(c)+j+1),:] = (X_dummy[(i*len(c)+j):(i*len(c)+j+1),:]) * c[j]
    for z in range(nrow-nc*len(c)):
        Y[(nc*len(c)+z):(nc*len(c)+z+1),:] = (X_dummy[(nc*len(c)+z):(nc*len(c)+z+1),:]) * c[z]
    return Y.reshape(X_ncol, X_nrow).transpose() # the answer I am looking for
Should work.

Python Sympy dot with the conjugate not same as norm squared

Why does sympy give false on the second Boolean? It correctly gives true on the first Boolean. I thought this last line would be the definition of the norm.
from sympy import *
eta_1, eta_2, m = 1, 1, 3
theta_1, theta_2 = symbols("theta_1 theta_2", real=True)
sigma_x = Matrix([[0, 1], [1, 0]])
sigma_y = Matrix([[0, -I], [I, 0]])
sigma_z = Matrix([[1, 0], [0, -1]])
H = eta_1*sin(theta_1)*sigma_x + eta_2*sin(theta_2)*sigma_y + (m-eta_1*cos(theta_1)-eta_2*cos(theta_2))*sigma_z
v = H.eigenvects()
l = v[0][0]
v = v[0][2][0]
n_normal = v/v.norm()
print(simplify(n_normal.norm()**2) == 1)
print(simplify(n_normal.dot(n_normal.H)))
print(simplify(n_normal.dot(n_normal.H)) == 1)
I think this has to do with the fact that sympy fails in that
simplify(abs(x**2)-x*conjugate(x))==0
gives false. Is there some way around this problem, such as another way to define an inner product that does behave correctly? I'm doing some complicated physics calculations for my thesis and I would really like to check my results with sympy.
PS. I'm using sympy version 1.4dev.
Edit: I think the problem is that simplify doesn't realize that
$2\cos(\theta_1)\cos(\theta_2) - 6\cos(\theta_1) - 6\cos(\theta_2) + 11 > 0.$
If I replace this expression in n_normal with its absolute value, it works. I think it is weird that the norm function does this correctly and the simplify of what essentially should be the norm doesn't.
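One possible way to check a result like this without depending on what simplify manages to prove is a numerical spot-check. The following is only a sketch on a toy vector (an assumption for illustration, not the eigenvector from the code above); inner, samples and max_err are hypothetical names:

```python
from sympy import I, Matrix, sin, symbols

theta = symbols("theta", real=True)

# Toy vector standing in for an eigenvector (assumption, not taken from H above).
v = Matrix([1, I * sin(theta)])
n = v / v.norm()

# Hermitian inner product <n, n>, written as a 1x1 matrix product.
inner = (n.H * n)[0, 0]

# Evaluate at a few sample angles instead of asking simplify(...) == 1.
samples = [0.0, 0.3, 1.7, -2.5]
values = [complex(inner.subs(theta, t).evalf()) for t in samples]
max_err = max(abs(val - 1) for val in values)
```

If max_err stays at rounding level for every sample, the expression is numerically 1 there, even when the symbolic comparison with == returns False because the two sides are structurally different.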

How to find the eigenvalues and eigenvectors of a matrix with SymPy?

I want to calculate the eigenvectors x of a system A by using this: A x = λ x
The problem is that I don't know how to solve for the eigenvalues by using SymPy.
Here is my code. I want to get some values for x1 and x2 from matrix A
from sympy import *
x1, x2, Lambda = symbols('x1 x2 Lambda')
I = eye(2)
A = Matrix([[0, 2], [1, -3]])
equation = Eq(det(Lambda*I-A), 0)
D = solve(equation)
print([N(element, 4) for element in D]) # Eigenvalues in decimal form
print(pretty(D)) # Eigenvalues in exact form
X = Matrix([[x1], [x2]]) # Eigenvectors
T = A*X - D[0]*X # A x = Lambda x with the first eigenvalue, Lambda = D[0]
print(pretty(solve(T, x1, x2)))
The methods eigenvals and eigenvects are what one would normally use here.
A.eigenvals() returns {-sqrt(17)/2 - 3/2: 1, -3/2 + sqrt(17)/2: 1} which is a dictionary of eigenvalues and their multiplicities. If you don't care about multiplicities, use list(A.eigenvals().keys()) to get a plain list of eigenvalues.
The output of eigenvects is a bit more complicated, and consists of triples (eigenvalue, multiplicity of this eigenvalue, basis of the eigenspace). Note that the multiplicity is algebraic multiplicity, while the number of eigenvectors returned is the geometric multiplicity, which may be smaller. The eigenvectors are returned as 1-column matrices for some reason...
For your matrix, A.eigenvects() returns the eigenvector [-2/(-sqrt(17)/2 + 3/2), 1] for the eigenvalue -3/2 + sqrt(17)/2, and eigenvector [-2/(3/2 + sqrt(17)/2), 1] for eigenvalue -sqrt(17)/2 - 3/2.
If you want the eigenvectors presented as plain lists of coordinates, the following
[list(tup[2][0]) for tup in A.eigenvects()]
would output [[-2/(-sqrt(17)/2 + 3/2), 1], [-2/(3/2 + sqrt(17)/2), 1]]. (Note this just picks one eigenvector for each eigenvalue, which is not always what you want)
SymPy has a very convenient way of getting eigenvalues and eigenvectors (see the SymPy documentation).
Your example would simply become:
from sympy import *
A = Matrix([[0, 2], [1, -3]])
print(A.eigenvals()) #returns eigenvalues and their algebraic multiplicity
print(A.eigenvects()) #returns eigenvalues, eigenvects
This answer will help when you need all the eigenvectors. The solution above doesn't always give you all of them, for example for the matrix A used below:
# the matrix
A = Matrix([
    [4, 0, 1],
    [2, 3, 2],
    [1, 0, 4]
])
sym_eignvects = []
for tup in A.eigenvects():
    for v in tup[2]:
        sym_eignvects.append(list(v))
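A quick sanity check on that loop, as a sketch: for this matrix the eigenvalue 3 has a two-dimensional eigenspace, so collecting every basis vector should yield three vectors in total, each satisfying A*v = λ*v. The name all_vectors is just for illustration:

```python
from sympy import Matrix, zeros

A = Matrix([
    [4, 0, 1],
    [2, 3, 2],
    [1, 0, 4],
])

all_vectors = []
for eigenvalue, multiplicity, basis in A.eigenvects():
    for v in basis:
        # Every returned basis vector should satisfy A*v == eigenvalue*v exactly.
        assert (A * v - eigenvalue * v) == zeros(3, 1)
        all_vectors.append((eigenvalue, list(v)))

print(len(all_vectors))  # number of eigenvectors collected across all eigenvalues
```

Taking only tup[2][0] would drop one of the two vectors belonging to the repeated eigenvalue.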

Python: fast residuals computing

What is the most efficient way to compute the residuals of two numpy arrays?
I'm currently doing it this way:
def residuals(array1, array2):
    total = 0.
    for i in range(len(array1)):
        total += (array1[i] - array2[i])**2
    return total
Is there a better solution?
Yes, note that you can perform mathematical operations directly on arrays and they are applied element-wise:
>>> import numpy as np
>>> arr1 = np.array((1, 2, 3))
>>> arr2 = np.array((4, 5, 6))
# differences
>>> arr1 - arr2
array([-3, -3, -3])
# squared differences
>>> (arr1 - arr2) ** 2
array([9, 9, 9])
# sum of squared differences
>>> np.sum((arr1 - arr2) ** 2)
27
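Wrapped up as a function, this might look like the sketch below. It assumes both inputs are 1-D and of equal length; the name residual_sum_of_squares is made up here, and np.dot is used as an alternative to squaring and summing in two steps:

```python
import numpy as np

def residual_sum_of_squares(array1, array2):
    """Sum of squared element-wise differences of two equal-length 1-D arrays."""
    diff = np.asarray(array1, dtype=float) - np.asarray(array2, dtype=float)
    # The dot product of diff with itself equals np.sum(diff ** 2) for 1-D input.
    return float(np.dot(diff, diff))

rss = residual_sum_of_squares([1, 2, 3], [4, 5, 6])
```

Either form avoids the Python-level loop, which is where most of the time goes for large arrays.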

Calculating "generating functions" with numpy

In mathematics, a "generating function" is defined from a sequence of numbers c0, c1, c2, ..., cn by c0+c1*x+c2*x^2 + ... + cn*x^n. These come as "moment generating functions", "probability generating functions" and various other types, depending on the source of the coefficient.
I have an array of the coefficients and I'd like a quick way to create the corresponding generating function.
I could do
import numpy as np
myArray = np.array([1,2,3,4])
x=0.2
sum([c*x**k for k,c in enumerate(myArray)])
or I could have an array having c[k] in the kth entry. It seems there should be a fast numpy way to do this.
Unfortunately attempts to look this up are complicated by the fact that "generate" and "function" are common words in programming, as is the combination "generating function" so I haven't had any luck with search engines.
x = .2
coeffs = np.array([1,2,3,4])
Make an array of the degree of each term
degrees = np.arange(len(coeffs))
Raise x to each degree
terms = np.power(x, degrees)
Multiply the coefficients and sum
result = np.sum(coeffs*terms)
>>> coeffs
array([1, 2, 3, 4])
>>> degrees
array([0, 1, 2, 3])
>>> terms
array([ 1. , 0.2 , 0.04 , 0.008])
>>> result
1.552
>>>
As a function:
def f(coeffs, x):
    degrees = np.arange(len(coeffs))
    terms = np.power(x, degrees)
    return np.sum(coeffs*terms)
Or simply use the NumPy Polynomial package
from numpy.polynomial import Polynomial as P
p = P(coeffs)
result = p(x)
If you are looking for performance, using np.einsum could be suggested too -
np.einsum('i,i->',myArray,x**np.arange(myArray.size))
>>> coeffs = np.random.random(5)
>>> coeffs
array([ 0.70632473, 0.75266724, 0.70575037, 0.49293719, 0.66905641])
>>> x = np.random.random()
>>> x
0.7252944971757169
>>> powers = np.arange(0, coeffs.shape[0], 1)
>>> powers
array([0, 1, 2, 3, 4])
>>> result = coeffs * x ** powers
>>> result
array([ 0.70632473, 0.54590541, 0.37126147, 0.18807659, 0.18514853])
>>> np.sum(result)
1.9967167252487628
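To see that these approaches agree, here is a small sketch using the coefficients from the question. It compares the explicit power sum with numpy.polynomial.polynomial.polyval and with a hand-written Horner loop; horner, via_powers, via_polyval and via_horner are illustrative names, not anything from the answers above:

```python
import numpy as np
from numpy.polynomial import polynomial as P

coeffs = np.array([1, 2, 3, 4])   # c0, c1, c2, c3
x = 0.2

def horner(coeffs, x):
    """Evaluate c0 + c1*x + ... + cn*x**n without computing powers explicitly."""
    result = 0.0
    for c in coeffs[::-1]:        # start from the highest-degree coefficient
        result = result * x + c
    return result

via_powers = np.sum(coeffs * x ** np.arange(len(coeffs)))
via_polyval = P.polyval(x, coeffs)   # coefficients ordered from lowest degree up
via_horner = horner(coeffs, x)
```

Note that np.polyval (the older top-level function) expects the highest-degree coefficient first, so the coefficient order matters when switching between the two.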
Using NumPy's Polynomial class is probably the easiest way.
from numpy.polynomial import Polynomial
coefficients = [1,2,3,4]
f = Polynomial( coefficients )
You can then use the object like any other function.
import numpy as np
import matplotlib.pyplot as plt
print(f(0.2))
x = np.linspace( -5, 5, 51 )
plt.plot( x , f(x) )
