How to calculate the divergence of a vector in sympy? - python

I want to calculate the divergence of a given vector with sympy. Is there a Python function for this? I looked through the functions of einsteinpy, but I still haven't found any that help.
Basically, I want to calculate \nabla_\mu (n v^\mu) = 0 for a given vector v, n being a constant number.
\nabla_\mu (n v^\mu) = 0 represents a divergence, where \mu runs over x, y and z and each term is the derivative of the corresponding vector component with respect to that coordinate. For example:
\nabla_\mu (n v^\mu) = \partial_x (u^x) + \partial_y (u^y) + \partial_z (u^z)
Here u^\mu = n v^\mu, and u can be something like (2x, 4y, 6z).
I appreciate any help.

As shown by @mikuszefski, you can use the sympy.vector module, which provides an implementation of the divergence in a coordinate system.
Another way to do what you want is to use the function derive_by_array to get a tensor and then perform an Einstein contraction.
import sympy as sp
x, y, z = sp.symbols("x y z") # dim = 3
# Now the functions that you want:
u, v, w = 2*x, 4*y, 6*z
# In a more general way, you can do:
u = sp.Function("u")(x, y, z)
v = sp.Function("v")(x, y, z)
w = sp.Function("w")(x, y, z)
U = sp.Array([u, v, w]) # U is a vector of dim = 3 (or sympy.Array)
X = sp.Array([x, y, z]) # X is a vector of dim = 3 (or sympy.Array)
dUdX = sp.derive_by_array(U, X) # dUdX is a tensor of dim = 3 and order = 2
# First way:
divU = sp.trace(sp.Matrix(sp.derive_by_array(U, X))) # Limited
# Second way:
divU = sp.tensorcontraction(sp.derive_by_array(U, X), (0, 1)) # More general
This solution also works in other dimensions (dim = 2, for example), as long as len(X) == len(U).
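For comparison, here is a minimal sketch of the sympy.vector route mentioned above, next to the tensor contraction, using the example field (2x, 4y, 6z) from the question. The coordinate system name C is an arbitrary choice made for this illustration.

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence

# Route 1: sympy.vector, with a Cartesian coordinate system named "C"
# (the name is arbitrary and only used for this illustration).
C = CoordSys3D("C")
field = 2*C.x*C.i + 4*C.y*C.j + 6*C.z*C.k
div_vector = divergence(field)

# Route 2: derive_by_array followed by a contraction over both indices.
x, y, z = sp.symbols("x y z")
U = sp.Array([2*x, 4*y, 6*z])
X = sp.Array([x, y, z])
div_tensor = sp.tensorcontraction(sp.derive_by_array(U, X), (0, 1))

print(div_vector, div_tensor)
```

Both routes should agree: each component contributes its own constant derivative, so the divergence is 2 + 4 + 6.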


Understanding backpropagation in PyTorch

I am exploring PyTorch, and I do not understand the output of the following example:
# Initialize x, y and z to values 4, -3 and 5
x = torch.tensor(4., requires_grad = True)
y = torch.tensor(-3., requires_grad = True)
z = torch.tensor(5., requires_grad = True)
# Set q to sum of x and y, set f to product of q with z
q = x + y
f = q * z
# Compute the derivatives
f.backward()
# Print the gradients
print("Gradient of x is: " + str(x.grad))
print("Gradient of y is: " + str(y.grad))
print("Gradient of z is: " + str(z.grad))
Output
Gradient of x is: tensor(5.)
Gradient of y is: tensor(5.)
Gradient of z is: tensor(1.)
I have little doubt that my confusion originates with a minor misunderstanding. Can someone explain in a stepwise manner?
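I hope you understand that when you do f.backward(), what you get in x.grad is \frac{\partial f}{\partial x}.
In your case,
f(x, y, z) = (x + y) z.
So, simply (with preliminary calculus),
\frac{\partial f}{\partial x} = z, \frac{\partial f}{\partial y} = z, \frac{\partial f}{\partial z} = x + y.
If you put in your values for x, y and z, that explains the outputs.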
But this isn't really the "backpropagation" algorithm. These are just partial derivatives (which is all you asked for in the question).
Edit:
If you want to know about the backpropagation machinery behind it, please see @Ivan's answer.
I can provide some insights on the PyTorch aspect of backpropagation.
When manipulating tensors that require gradient computation (requires_grad=True), PyTorch keeps track of operations for backpropagation and constructs a computation graph ad hoc.
Let's look at your example:
q = x + y
f = q * z
Its corresponding computation graph can be represented as:
x -------\
          -> x + y = q ------\
y -------/                    -> q * z = f
                             /
z --------------------------/
Here x, y, and z are called leaf tensors. Backward propagation consists of computing the gradients of x, y, and z, which correspond to dL/dx, dL/dy, and dL/dz respectively, where L is a scalar value based on the graph output f. Each operation performed needs to have a backward function implemented (which is the case for all mathematically differentiable PyTorch builtins). For each operation, this function is effectively used to compute the gradient of the output w.r.t. the input(s).
The backward pass would look like this:
  dL/dx <------\
x -----\        \
        \    dq/dx
         \        \ <--- dL/dq-----\
          -> x + y = q ----\        \
         /        /         \    df/dq
        /    dq/dy           \        \ <--- dL/df ---
y -----/        /             -> q * z = f
  dL/dy <------/             /        /
                            /    df/dz
z -------------------------/        /
  dL/dz <--------------------------/
The "d(outputs)/d(inputs)" terms for the first operator are: dq/dx = 1, and dq/dy = 1. For the second operator they are df/dq = z, and df/dz = q.
Backpropagation comes down to applying the chain rule: dL/dx = dL/dq * dq/dx = dL/df * df/dq * dq/dx. Intuitively, we decompose dL/dx in the opposite direction to what backpropagation actually does, which is to navigate bottom-up.
Without shape considerations, we start from dL/df = 1. In reality, dL/df has the shape of f (see my other answer linked below). This results in dL/dx = 1 * z * 1 = z. Similarly for y and z, we have dL/dy = z and dL/dz = q = x + y, which are the results you observed.
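The chain-rule steps above can be traced by hand in plain Python. This is only an illustrative sketch of the arithmetic with the values from the question, not how PyTorch implements autograd internally; the dL_* variable names are made up for this example.

```python
# Forward pass with the values from the question.
x, y, z = 4.0, -3.0, 5.0
q = x + y          # q = 1.0
f = q * z          # f = 5.0

# Backward pass, starting from dL/df = 1 and walking the graph in reverse.
dL_df = 1.0
dL_dq = dL_df * z      # local derivative df/dq = z
dL_dz = dL_df * q      # local derivative df/dz = q
dL_dx = dL_dq * 1.0    # local derivative dq/dx = 1
dL_dy = dL_dq * 1.0    # local derivative dq/dy = 1

print(dL_dx, dL_dy, dL_dz)  # 5.0 5.0 1.0
```

These match the x.grad, y.grad and z.grad values printed in the question.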
Some answers I gave to related topics:
Understand PyTorch's graph generation
Meaning of grad_outputs in PyTorch's torch.autograd.grad
Backward function of the normalize operator
Difference between autograd.grad and autograd.backward
Understanding Jacobian tensors in PyTorch
You just have to understand which operations are involved and which partial derivatives apply to each of them. For example:
x = torch.tensor(1., requires_grad = True)
q = x*x
q.backward()
print("Gradient of x is: " + str(x.grad))
will give you 2, because the derivative of x*x is 2*x.
If we take your example for x, we have:
q = x + y
f = q * z
which can be rewritten as:
f = (x+y)*z = x*z+y*z
If we take the partial derivative of f with respect to x, we end up with just z.
To arrive at this result, you treat all other variables as constants and apply the derivative rules you already know.
But keep in mind that the process PyTorch executes to get these results is neither symbolic nor numeric differentiation; it is automatic differentiation, a computational method for obtaining gradients efficiently.
Take a closer look at:
https://www.cs.toronto.edu/~rgrosse/courses/csc321_2018/slides/lec10.pdf

How to randomly scatter points on a sphere

using PyPlot
n = 50
u = range(0,stop=2*π,length=n);
v = range(0,stop=π,length=n);
x = cos.(u) * sin.(v)';
y = sin.(u) * sin.(v)';
z = ones(n) * cos.(v)';
scatter3D(vec(x),vec(y),vec(z);c="red",s=1)
However, if I multiply vec(x), vec(y), vec(z) by rand(),
I still get the same plot; the only difference is that the axes change, or in other words, the sphere gets "squished".
using PyPlot
n = 50
u = range(0,stop=2*π,length=n);
v = range(0,stop=π,length=n);
x = cos.(u) * sin.(v)';
y = sin.(u) * sin.(v)';
z = ones(n) * cos.(v)';
scatter3D(rand()*vec(x),rand()*vec(y),rand()*vec(z);c="red",s=1)
The simplest approach seems to be sampling a Gaussian for each dimension and then normalizing the length of the resulting vector as described in this answer. There's a very slight possibility of getting a vector with zero length, which can be handled with rejection sampling. Putting that together you would do this:
points = map(1:n) do _
    while true
        x = randn()
        y = randn()
        z = randn()
        l = hypot(x, y, z)
        l ≠ 0 && return (x, y, z) ./ l
    end
end
This gives a vector of 3-tuples each representing x, y and z coordinates of points, which you can plot as before. Separate vectors of coordinates can be extracted using comprehensions:
xs = [p[1] for p in points]
ys = [p[2] for p in points]
zs = [p[3] for p in points]
This approach can readily be generalized to any number of dimensions.
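As a rough illustration of that generalization, here is a NumPy sketch of the same idea (Gaussian samples normalized to unit length, with zero-length vectors resampled). It is a translation into Python, not the Julia code above; the function name random_points_on_sphere and its parameters are hypothetical.

```python
import numpy as np

def random_points_on_sphere(n, dim=3, rng=None):
    """Return n points uniformly distributed on the unit sphere in `dim` dimensions."""
    rng = np.random.default_rng() if rng is None else rng
    pts = rng.standard_normal((n, dim))
    norms = np.linalg.norm(pts, axis=1, keepdims=True)
    # Rejection step: resample any (extremely unlikely) zero-length vectors.
    while np.any(norms == 0):
        bad = (norms == 0).ravel()
        pts[bad] = rng.standard_normal((int(bad.sum()), dim))
        norms = np.linalg.norm(pts, axis=1, keepdims=True)
    return pts / norms

points = random_points_on_sphere(50, dim=3, rng=np.random.default_rng(0))
xs, ys, zs = points.T  # separate coordinate vectors, ready for a 3D scatter plot
```

Changing dim gives points on a circle (dim=2) or on higher-dimensional spheres without touching the rest of the code.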

How to get the value of a middle variable in a function that need to use 'fsolve'?

My first py file is the function that I want to find the roots, like this:
def myfun(unknowns,a,b):
    x = unknowns[0]
    y = unknowns[1]
    eq1 = a*y+b
    eq2 = x**b
    z = x*y + y/x
    return eq1, eq2
And my second one is to find the value of x and y from a starting point, given the parameter value of a and b:
a = 3
b = 2
x0 = 1
y0 = 1
x, y = scipy.optimize.fsolve(myfun, (x0,y0), args= (a,b))
My question is: I actually need the value of z after plugging in the found x and y, and I don't want to repeat z = x*y + y/x + ... again, because in my real case it is an intermediate variable without an explicit expression.
However, I cannot replace the last line of myfun with return eq1, eq2, z, since fsolve only finds the roots of eq1 and eq2.
The only solution I have now is to rewrite this function so that it returns z, and plug in x and y to get z.
Is there a good solution to this problem?
I believe that's the wrong approach. Since you have z as a direct function of x and y, then what you need is to retrieve those two values. In the listed case, it's easy enough: given b you can derive x as the inverse of eqn2; also given a, you can invert eqn1 to get y.
For clarity, I'm changing the names of your return variables:
ret1, ret2 = scipy.optimize.fsolve(myfun, (x0,y0), args= (a,b))
Now, invert the two functions:
# eq2 = x**b
x = ret2**(1/b)
# eq1 = a*y+b
y = (ret1 - b) / a
... and finally ...
z = x*y + y/x
Note that you should remove the z computation from your function, as it serves no purpose.
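Note that scipy.optimize.fsolve returns the values of the unknowns at the root (here x and y), not the values of eq1 and eq2. One alternative, sketched below, keeps the intermediate computation in a single helper that returns both the residuals and z, and gives fsolve a thin wrapper that drops z. The helper names are hypothetical, and eq2 is changed to x**b - 4 purely for this illustration, because the original eq2 = x**b has its root at x = 0, where y/x is undefined.

```python
from scipy.optimize import fsolve

def residuals_and_z(unknowns, a, b):
    """Compute the residuals and the intermediate value z in one place."""
    x, y = unknowns
    eq1 = a * y + b
    eq2 = x**b - 4       # assumption: "- 4" added so that the root has x != 0
    z = x * y + y / x    # intermediate quantity we also want at the solution
    return (eq1, eq2), z

def residuals(unknowns, a, b):
    """Wrapper for fsolve: only the residuals are returned."""
    return residuals_and_z(unknowns, a, b)[0]

a, b = 3, 2
x, y = fsolve(residuals, (1.0, 1.0), args=(a, b))

# Evaluate the helper once more at the solution to recover z.
_, z = residuals_and_z((x, y), a, b)
print(x, y, z)
```

The expression for z is written only once, and the extra call after solving costs a single function evaluation.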

How to succinctly map over a plane with numpy

I have written code to plot the average squared error of a linear function over a given dataset, to visualise progress during a gradient descent training for the optimum regression line.
The relevant bits are these:
def compute_error(f, X, Y):
    e = lambda x, y : (y - f(x))**2
    return sum(e(x, y) for (x, y) in zip(X, Y))/len(X)
mn, bn, density = abs(target_slope)*1.5, abs(target_intercept)*1.5, 20
M, B = map(list, zip(*[(m, b) for m in np.linspace(-mn, +mn, density)
                              for b in np.linspace(-bn, +bn, density)]))
E = [compute_error(lambda x : m*x+b, X, Y) for m, b in zip(M,B)]
This works, but is very messy. I suspect there might be a very succinct way to pull off the same thing with numpy. So far I have gotten this:
M, B = map(np.ndarray.flatten, np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density])
I still don't know how to improve the instantiation of E, and for some reason right now it is a lot slower than the messy version.
So, what would be a good way to map over a plane like MXB with numpy?
If you want to run above code you can build X and Y like so:
import numpy as np
from numpy.random import normal
target_slope = 3
target_intercept = 15
def generate_random_data(slope=1, minx=0, maxx=100, n=200, intercept=0):
    f = lambda x : normal(slope*x, maxx/5)+intercept
    X = np.linspace(minx, maxx, n)
    Y = [f(x) for x in X]
    return X, Y
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept)
def compute_error(f, X, Y):
    return np.mean( (Y - f(X))**2 )
MB = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
MB = MB.reshape((2, -1)).T
E = [compute_error(lambda x : m*x+b, X, Y) for m, b in MB]
It is possible to write a full numpy solution:
Y = np.array(Y)
M, B = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
mx = M.reshape((-1,1))*X
b = B.reshape((-1,1))*np.ones_like(X)
E = np.mean( (mx+b - Y)**2, axis=1 )
It may also be possible to write a solution that does not need to flatten the arrays and returns the error as a 2D array.
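One way this could look is sketched below: broadcasting over three axes gives the error directly as a 2D array indexed by (slope, intercept). The synthetic data and the use of np.linspace for the grid are assumptions made for this example, mirroring the question's setup rather than reproducing it exactly.

```python
import numpy as np

# Synthetic data in the spirit of the question (slope 3, intercept 15).
rng = np.random.default_rng(0)
X = np.linspace(0, 100, 200)
Y = 3 * X + 15 + rng.normal(0, 20, size=X.shape)

mn, bn, density = 4.5, 22.5, 20
m = np.linspace(-mn, mn, density)   # candidate slopes
b = np.linspace(-bn, bn, density)   # candidate intercepts

# pred[i, j, k] = m[i] * X[k] + b[j], built purely by broadcasting.
pred = m[:, None, None] * X[None, None, :] + b[None, :, None]

# Mean squared error over the data axis: E[i, j] is the error for (m[i], b[j]).
E = np.mean((pred - Y) ** 2, axis=-1)
print(E.shape)  # (20, 20)
```

E can be passed straight to a surface or contour plot together with np.meshgrid(m, b, indexing="ij").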
I don't fully follow what you're trying to achieve here. However, this may help get you started with a numpy solution:
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept, n=180)
M, B = np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density]
f = M.T*X + B.T
error = np.sum((f-Y)**2)
Note that I've had to alter the default number of X, Y values.

Normal Equation Implementation in Python / Numpy

I've written some beginner code to calculate the co-efficients of a simple linear model using the normal equation.
# Modules
import numpy as np
# Loading data set
X, y = np.loadtxt('ex1data3.txt', delimiter=',', unpack=True)
data = np.genfromtxt('ex1data3.txt', delimiter=',')
def normalEquation(X, y):
    m = int(np.size(data[:, 1]))
    # This is the feature / parameter (2x2) vector that will
    # contain my minimized values
    theta = []
    # I create a bias_vector to add to my newly created X vector
    bias_vector = np.ones((m, 1))
    # I need to reshape my original X(m,) vector so that I can
    # manipulate it with my bias_vector; they need to share the same
    # dimensions.
    X = np.reshape(X, (m, 1))
    # I combine these two vectors together to get a (m, 2) matrix
    X = np.append(bias_vector, X, axis=1)
    # Normal Equation:
    # theta = inv(X^T * X) * X^T * y
    # For convenience I create a new, transposed X matrix
    X_transpose = np.transpose(X)
    # Calculating theta
    theta = np.linalg.inv(X_transpose.dot(X))
    theta = theta.dot(X_transpose)
    theta = theta.dot(y)
    return theta
p = normalEquation(X, y)
print(p)
Using the small data set found here:
http://www.lauradhamilton.com/tutorial-linear-regression-with-octave
I get the co-efficients: [-0.34390603; 0.2124426 ] using the above code instead of: [24.9660; 3.3058]. Could anyone help clarify where I am going wrong?
You can implement normal equation like below:
import numpy as np
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X] # add x0 = 1 to each instance
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
X_new = np.array([[0], [2]])
X_new_b = np.c_[np.ones((2, 1)), X_new] # add x0 = 1 to each instance
y_predict = X_new_b.dot(theta_best)
y_predict
This assumes X is an m × (n+1) matrix whose first column x_0 is always 1, and y is an m-dimensional vector.
import numpy as np
step1 = np.dot(X.T, X)
step2 = np.linalg.pinv(step1)
step3 = np.dot(step2, X.T)
theta = np.dot(step3, y) # if y is m x 1. If 1xm, then use y.T
Your implementation is correct. You've only swapped X and y (look closely at how they define x and y), which is why you get a different result.
The call normalEquation(y, X) gives [ 24.96601443 3.30576144] as it should.
Here is the normal equation in one line:
theta = np.dot(np.linalg.inv(np.dot(X.T,X)),np.dot(X.T,Y))
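As a sanity check for any of the implementations above, the normal-equation result can be compared with NumPy's least-squares solver. The sketch below uses synthetic data (true intercept 4, slope 3), not the data set from the question, and solves the linear system with np.linalg.solve instead of forming an explicit inverse.

```python
import numpy as np

# Synthetic data: y = 4 + 3x + noise (assumed values, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 4.0 + 3.0 * x + rng.normal(0, 0.5, size=50)

# Design matrix with a leading column of ones for the intercept term.
X_b = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (X^T X) theta = X^T y.
theta_normal = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Reference solution from NumPy's least-squares routine.
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print(theta_normal, theta_lstsq)
```

If the two results disagree, the usual suspects are a missing bias column or, as in the question, features and targets passed in the wrong order.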
