Need help putting a minimization equation into scipy weighted least squares solver - python

My question is how can I put a weighted least squares problem into a python solver. I'm trying to implement the approaches in the paper found here (PDF warning). There is an overview of the problem at the bottom of the post.
Specifically I want to start with the following minimization equation (19 in the paper):
latex formula can be found here:
\min_{\Theta \in M} \sum_{j=1}^{n} \sum_{i=1}^{m} w(i,j) \left| \Psi(i,j)\,\Theta(i,j) - I(i,j) \right|^{2}
It is represented as a weighted least squares problem.
w, psi, and I are my knowns, and I am trying to solve for theta.
I tried at first creating a function that takes a theta and returns the sum of this equation exactly as it's expressed above. Then I passed it to scipy.optimize.least_squares, but the theta values always remained the same after optimization. I tried implementing a jacobian, but the resulting sum explodes to huge negative values. It also takes ages as I'm attempting to run this on images (I is the pixel value for a pixel j with light i).
I then realized I'm almost certainly misunderstanding how to solve this problem and could use some help approaching it. My current code is below:
def theta_solver(self, theta):
    imshape = self.images.shape
    sm = 0
    for j in j_array:
        for i in i_array:
            w = self.get_w(i, j, theta)
            psi = self.non_diff_smoothing(self.get_psi(i, j))
            diff = psi*(theta[i, j]) - self.I[i, j]
            res = w*(diff)
            sm += res
    return sm

def solve_theta(self, theta_guess):
    res = scipy.optimize.least_squares(self.theta_solver, theta_guess)
Something tells me I'm way off base for how I'm approaching this problem, and I could use a finger in the right direction. Thanks for your time.
Problem overview:
This particular vision approach is called photometric stereo. By taking several images of a scene with different light sources, we can create a 3D reconstruction of that scene.
One issue is that the 1/r^2 decay in lighting depends on the distance from the light source, which means the problem can't be solved with the usual linear methods.
The approach documented in the paper is a nonlinear approach for solving near-light photometric stereo. It alternates between two solvers:
one for the surface Z, and
one for the albedos/intensities at each pixel, represented by theta.
In this question I'm only trying to solve the theta element of the equation, which can be solved via weighted least squares.

Turns out I was heavily overthinking the problem. This can be decomposed into a simple linear system of the form Ax = b. When looking at an error equation, in this case:
argmin(THETA) sum(W * ||PSI * THETA - I||^2)
we can move the weight inside the squared norm by taking its square root, so each residual becomes sqrt(W) * (PSI * THETA - I). The system to solve in the least-squares sense ends up being:
sqrt(W) * PSI * THETA = sqrt(W) * I
This is equivalent to the normal equations PSI^T * W * PSI * THETA = PSI^T * W * I, which we can solve using your favorite linear solver (e.g. conjugate gradient).
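A minimal sketch of the square-root-weight formulation for a single pixel. The shapes are assumptions made for illustration and are not taken from the paper or the code in the question: psi is an (m, 3) array with one row per light, intensities and weights are length-m vectors, and theta is a 3-vector. The function name weighted_lstsq and the synthetic data are hypothetical.

```python
import numpy as np

def weighted_lstsq(psi, intensities, weights):
    """Minimise sum_i w_i * (psi_i . theta - I_i)**2 for one pixel."""
    sqrt_w = np.sqrt(weights)
    a = sqrt_w[:, None] * psi      # scale each row of PSI by sqrt(w_i)
    b = sqrt_w * intensities       # scale each observation the same way
    theta, *_ = np.linalg.lstsq(a, b, rcond=None)
    return theta

# Synthetic data, made up for illustration: 8 lights, noise-free intensities
rng = np.random.default_rng(0)
psi = rng.normal(size=(8, 3))
theta_true = np.array([0.2, -0.5, 0.8])
intensities = psi @ theta_true
weights = rng.uniform(0.1, 1.0, size=8)

theta = weighted_lstsq(psi, intensities, weights)
```

If the weights themselves depend on theta, as get_w(i, j, theta) in the question suggests, one option is to hold them fixed for each solve and recompute them between solves.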

Related

Solving PDE on 1D cylindrical coordinates with FiPy

Let me start by saying that I have found similar problems to mine on the NARKIVE FiPy mailing list archive but since the equations won't load, they are not very useful. For example Convection-diffusion problem on a 1D cylindrical grid, or on another mailing list archive Re: FiPy Heat Transfer Solution. In the second linked mail Daniel says:
There are two ways to solve on a cylindrical domain in FiPy. You can either
use the standard diffusion equation in Cartesian coordinates (2nd equation
below) and with a mesh that is actually cylindrical in shape or you can use
the diffusion equation formulated on a cylindrical coordinate system (1st
equation below) and use a standard 2D / 1D grid mesh.
And the equations are not there. In this case it is actually fine because I understand the first solution and I want to use that.
I want to solve the following equation on a 1D cylindrical grid (sorry I don't have 10 reputation yet so I cannot post the nice rendered equations):
with boundary conditions:
where rho_core is the left side of the mesh, and rho_edge is the right side of the mesh. rho is the normalized radius, J is the Jacobian:
R is the real radius in meters, so the dimension of the Jacobian is distance. The initial conditions don't really matter, but in my code example I will use a numerical Dirac-delta at R=0.8.
I have a working example without(!) the Jacobian, but it's quite long, and it doesn't use FiPy's Viewers so I'll link a gist: https://gist.github.com/leferi99/142b90bb686cdf5116ef5aee425a4736
The main part in question is the following:
import fipy as fp ## finite volume PDE solver
from fipy.tools import numerix ## requirement for FiPy, in practice same as numpy
import copy ## we need the deepcopy() function because some FiPy objects are mutable
import numpy as np
import math
## numeric implementation of Dirac delta function
def delta_func(x, epsilon, coeff):
    return ((x < epsilon) & (x > -epsilon)) * \
        (coeff * (1 + numerix.cos(numerix.pi * x / epsilon)) / (2 * epsilon))
rho_from = 0.7 ## normalized inner radius
rho_to = 1. ## normalized outer radius
nr = 1000 ## number of mesh cells
dr = (rho_to - rho_from) / nr ## normalized distance between the centers of the mesh cells
duration = 0.001 ## length of examined time evolution in seconds
nt = 1000 ## number of timesteps
dt = duration / nt ## length of one timestep
## 3D array for storing the density with the corresponding normalized radius values
## the density values corresponding to the n-th timestep will be in the n-th line
solution = np.zeros((nt,nr,2))
## loading the normalized radial coordinates into the array
for j in range(nr):
    solution[:,j,0] = (j * dr) + (dr / 2) + rho_from
mesh = fp.CylindricalGrid1D(dx=dr, nx=nr) ## 1D mesh based on the normalized radial coordinates
mesh = mesh + (0.7,) ## translation of the mesh to rho=0.7
n = fp.CellVariable(mesh=mesh) ## fipy.CellVariable for the density solution in each timestep
diracLoc = 0.8 ## location of the middle of the Dirac delta
diracCoeff = 1. ## Dirac delta coefficient ("height")
diracPercentage = 2 ## width of Dirac delta (full width from 0 to 0) in percentage of full examined radius
diracWidth = int((nr / 100) * diracPercentage)
## diffusion coefficient
diffCoeff = fp.CellVariable(mesh=mesh, value=100.)
## convection coefficient - must be a vector
convCoeff = fp.CellVariable(mesh=mesh, value=(1000.,))
## applying initial condition - uniform density distribution
n.setValue(1)
## boundary conditions
gradLeft = (0.,) ## density gradient (at the "left side of the radius") - must be a vector
valueRight = 0. ## density value (at the "right end of the radius")
n.faceGrad.constrain(gradLeft, where=mesh.facesLeft) ## applying Neumann boundary condition
n.constrain(valueRight, mesh.facesRight) ## applying Dirichlet boundary condition
convCoeff.setValue(0, where=mesh.x<(rho_from + dr)) ## convection coefficient 0 at the inner edge
## the PDE
eq = (fp.TransientTerm() == fp.DiffusionTerm(coeff=diffCoeff)
      - fp.ConvectionTerm(coeff=convCoeff))
## Solving the PDE and storing the data
for i in range(nt):
    eq.solve(var=n, dt=dt)
    solution[i,0:nr,1]=copy.deepcopy(n.value)
My code can solve the following equation with the same boundary conditions as indicated above:
To keep it simple I use spatially independent coefficients, with the only exception at the inner edge, where the convection coefficient is 0 and the diffusion coefficient is almost 0. In the linked code I am using a uniform distribution initial condition.
My first question is why do I get the exact same results when using fipy.Grid1D and fipy.CylindricalGrid1D? I should get different results, right? How should I rewrite my code for it to be able to differentiate between the simple 1D Grid and the 1D Cylindrical Grid?
My actual problem is not with this exact code, I just wanted to simplify my problem, but as indicated in the comments this code doesn't produce the same results with the different Grids. So I will just post a GitHub link to a Jupyter Notebook, which may stop working in the future.
The Jupyter Notebook If you want to run it, the first code cell should be run first and after that only the very last cell is relevant. Ignore the reference images. The line plots show the diffusion and convection coefficients. When I ran the last cell with Grid1D or CylindricalGrid1D I got the same results (I compared the plots very precisely)
Sorry but I just cannot rename all my variables, so I hope that based on my comment, and the changed code above (I changed the comments in the code too) you can understand what I'm trying to do.
My other question is regarding the Jacobian. How can I implement it? I've looked at the only example in the documentation which uses a Jacobian, but that Jacobian is a matrix and also it uses the scipy.optimize.fsolve() function.
[cobbling an answer from the discussion in the comments]
The results are similar between a Grid1D and a CylindricalGrid1D, particularly in the early steps, but they are not the same. They are quite different as the problem evolves.
FiPy doesn't like things outside the divergence, but you should be able to multiply the equation by J and put it in the coefficient of the TransientTerm, e.g.,
or
eq = fp.TransientTerm(J) == fp.DiffusionTerm(coeff=J * diffCoeff) - fp.ConvectionTerm(coeff=J * convCoeff)
For the Jacobian, you could create a CellVariable for the real radius in terms of the normalized radius, and then take its gradient:
real_radius = fp.CellVariable(mesh=mesh, value=...)
J = real_radius.grad.dot([[1]])
.grad returns a vector, even in 1D, but the coefficient must be scalar, so take the dot product to get the x component.

Faster Multigrid Poisson Equation Solver?

I am trying to make my own CFD solver and one of the most computationally expensive parts is solving for the pressure term. One way to solve Poisson differential equations faster is by using a multigrid method. The basic recursive algorithm for this is:
function phi = V_Cycle(phi,f,h)
    % Recursive V-Cycle Multigrid for solving the Poisson equation (\nabla^2 phi = f) on a uniform grid of spacing h
    % Pre-Smoothing
    phi = smoothing(phi,f,h);
    % Compute Residual Errors
    r = residual(phi,f,h);
    % Restriction
    rhs = restriction(r);
    eps = zeros(size(rhs));
    % stop recursion at smallest grid size, otherwise continue recursion
    if smallest_grid_size_is_achieved
        eps = smoothing(eps,rhs,2*h);
    else
        eps = V_Cycle(eps,rhs,2*h);
    end
    % Prolongation and Correction
    phi = phi + prolongation(eps);
    % Post-Smoothing
    phi = smoothing(phi,f,h);
end
I've attempted to implement this algorithm myself; however, it is very slow and doesn't give good results, so evidently it is doing something wrong. I've been trying to find out why for too long, and I think it's worthwhile seeing if anyone can help me.
If I use a grid size of 2^5 by 2^5 points, then it can solve it and give reasonable results. However, as soon as I go above this it takes exponentially longer to solve and basically gets stuck at some level of inaccuracy, no matter how many V-cycles are performed. At 2^7 by 2^7 points, the code takes way too long to be useful.
I think my main issue is that my implementation of the Jacobi iteration uses linear algebra to calculate the update at each step. This should, in general, be fast; however, the update matrix A has n*m rows and columns (one per grid point), and computing the matrix-vector product for a 2^7 by 2^7 grid is expensive. As most of the entries are just zeros, should I calculate the result using a different method?
If anyone has any experience in multigrid methods, I would appreciate any advice!
Thanks
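One way to avoid building the update matrix at all is to apply the 5-point stencil directly with array slicing. The following is only a sketch of a weighted Jacobi smoother under assumptions not stated in the question: a square uniform grid, Dirichlet boundary values stored in the outer ring of phi, and the discretisation (sum of four neighbours - 4*phi) / h^2 = f. The function name jacobi_smooth and the damping factor omega=0.8 are illustrative choices.

```python
import numpy as np

def jacobi_smooth(phi, f, h, sweeps=1, omega=0.8):
    """Matrix-free weighted Jacobi sweeps for lap(phi) = f on a uniform grid.

    The outer ring of phi holds fixed (Dirichlet) boundary values.
    """
    phi = phi.copy()
    for _ in range(sweeps):
        # Sum of the four neighbours of every interior point, from the old phi
        neighbours = (phi[:-2, 1:-1] + phi[2:, 1:-1]
                      + phi[1:-1, :-2] + phi[1:-1, 2:])
        jacobi = 0.25 * (neighbours - h**2 * f[1:-1, 1:-1])
        phi[1:-1, 1:-1] = (1 - omega) * phi[1:-1, 1:-1] + omega * jacobi
    return phi

# Small check problem: phi = x^2 + y^2 has Laplacian 4 everywhere
n = 9
h = 1.0 / (n - 1)
xs = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(xs, xs, indexing="ij")
exact = X**2 + Y**2
f = np.full((n, n), 4.0)

guess = exact.copy()
guess[1:-1, 1:-1] = 0.0      # keep exact boundary values, zero the interior
smoothed = jacobi_smooth(guess, f, h, sweeps=200)
```

Each sweep costs a handful of array additions, so the work grows linearly with the number of grid points. If an explicit matrix is still wanted, a sparse format such as those in scipy.sparse avoids storing and multiplying the zeros.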

scipy.sparse.linalg.eigsh does not work for generalised eigenvalues

I'm working on a machine learning project which involves doing a Principal Component Analysis on some labeled data and using those labels to extract more valuable information from the data.
To do that, I'm calculating a scatter matrix for each class, and for each pair of classes I need to solve a generalised eigenvalue problem for their scatter matrices, as follows:
S_i * v = w * (S_j + b.I) * v
where b is a multiplier and I is the identity matrix. Now, this is the code in python:
jeigenvalues = eigsh(scatter_j, k=10, return_eigenvectors=False, maxiter=100)
print('eigenvalues made')
beta = betaMult*mean(jeigenvalues)
print(beta)
print(scatter_j+beta*eye(shape(x_data)[1]))
w, v = eigsh(scatter_i,M=scatter_j+beta*eye(shape(x_data)[1]),k=int(numberOfEVs/45), maxiter=100)
print(i,j,'done')
numberOfEVs is 90 in my current code (so that it's divisible by 45).
But the problem is, at the line where I use the eigsh for the aforementioned formula, it never gives me an answer. It keeps eating more and more memory without even completing a single iteration (I set its maxiter input to 1, and it still didn't give an answer). When I don't give the eigsh function the M argument (which is the matrix on the right side of the generalised EV problem and it is assumed to be "I" when not specified), it works correctly. But when M is provided, it becomes unresponsive.
Any ideas?
EDIT: The scatter matrices have rather small entries, mostly around 10^-5. I've also tried multiplying the left hand side by the inverse of the RHS matrix, and again it's having the same issue (goes on for a long time without an answer). Is the smallness of these entries the issue? How can I solve it, then?
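This does not diagnose why eigsh hangs, but if the scatter matrices are dense and of moderate size, one alternative worth trying is the dense solver scipy.linalg.eigh, which accepts the right-hand matrix as its second argument and can return only the largest eigenpairs through subset_by_index. The sketch below uses made-up random data with small entries; the function name top_generalized_eigs, the matrix sizes, and the 0.1 multiplier are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def top_generalized_eigs(scatter_i, scatter_j, beta, k):
    """Largest k eigenpairs of S_i v = w (S_j + beta * I) v using a dense solver."""
    n = scatter_i.shape[0]
    rhs = scatter_j + beta * np.eye(n)   # positive definite when beta > 0
    w, v = eigh(scatter_i, rhs, subset_by_index=[n - k, n - 1])
    return w, v

# Made-up scatter matrices with small entries (around 1e-5)
rng = np.random.default_rng(0)
a = rng.normal(size=(50, 20)) * 1e-3
b = rng.normal(size=(60, 20)) * 1e-3
scatter_i = a.T @ a
scatter_j = b.T @ b
beta = 0.1 * np.mean(np.linalg.eigvalsh(scatter_j))

w, v = top_generalized_eigs(scatter_i, scatter_j, beta, k=2)
```

The eigenvalues come back in ascending order, with the matching eigenvectors in the columns of v.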

Rotated Paraboloid Surface Fitting

I have a set of experimentally determined (x, y, z) points which correspond to a parabola. Unfortunately, the data is not aligned along any particular axis, and hence corresponds to a rotated parabola.
I have the following general surface:
Ax^2 + By^2 + Cz^2 + Dxy + Gyz + Hzx + Ix + Jy + Kz + L = 0
I need to produce a model that can represent the parabola accurately using (I'm assuming) least squares fitting. I cannot seem to figure out how this works. I have thought of rotating the parabola until its central axis lines up with the z-axis, but I do not know what this axis is. Matlab's cftool only seems to fit equations of the form z = f(x, y) and I am not aware of anything in Python that can solve this.
I also tried solving for the parameters numerically. When I tried making this into a matrix equation and solving by least squares, the matrix turned out to be invertible and hence my parameters were just all zero. I also am stuck on this and any help would be appreciated. I don't really mind the method as I am familiar with matlab, python and linear algebra if need be.
Thanks
Don't use any toolboxes, GUIs or special functions for this problem. Your problem is very common and the equation you provided may be solved in a very straightforward manner. The solution to the linear least squares problem can be outlined as:
The basis of the vector space is x^2, y^2, z^2, xy, yz, zx, x, y, z, 1. Therefore your vector has 10 dimensions.
Your problem may be expressed as Ap=b, where p = [A B C D G H I J K L]^T is the vector containing your parameters, named as in your surface equation. The right hand side b should be all zeros, but will contain some residual due to model errors, uncertainty in the data or for numerical reasons. This residual has to be minimized.
The matrix A has a dimension of N by 10, where N denotes the number of known points on the surface of the paraboloid.
A = [x(1)^2 y(1)^2 ... y(1) z(1) 1
...
x(N)^2 y(N)^2 ... y(N) z(N) 1]
Solve the overdetermined system of linear equations by computing p = A\b.
Do you have enough data points to fit all 10 parameters - you will need at least 10?
I also suspect that 10 parameters are too many to describe a general paraboloid, meaning that some of the parameters are dependent. My feeling is that a translated and rotated paraboloid needs 7 parameters (although I'm not really sure).
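Note that with b exactly zero, a plain least squares solve of Ap = b returns the trivial solution p = 0, which matches what the question describes. A common workaround, swapped in here and not part of the answer above, is to fix the scale with the constraint ||p|| = 1 and take the right singular vector of A that belongs to the smallest singular value. The sketch below assumes the column order x^2, y^2, z^2, xy, yz, zx, x, y, z, 1; the function name fit_quadric and the synthetic points are made up for illustration.

```python
import numpy as np

def fit_quadric(points):
    """Fit A x^2 + B y^2 + C z^2 + D xy + G yz + H zx + I x + J y + K z + L = 0.

    Returns the unit-norm parameter vector and the singular values of the
    design matrix.
    """
    x, y, z = np.asarray(points, dtype=float).T
    design = np.column_stack([x*x, y*y, z*z, x*y, y*z, z*x,
                              x, y, z, np.ones_like(x)])
    _, singular_values, vt = np.linalg.svd(design, full_matrices=False)
    return vt[-1], singular_values   # last row: smallest singular value

# Synthetic, noise-free points on the paraboloid z = x^2 + y^2
rng = np.random.default_rng(1)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
points = np.column_stack([xy, xy[:, 0]**2 + xy[:, 1]**2])

p, s = fit_quadric(points)
```

The same call works on rotated and translated data, because a rotated quadric is still described by the general 10-term equation. The overall sign and scale of p are arbitrary, so it helps to normalise by one coefficient before comparing fits.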

High frequency noise at solving differential equation

I'm trying to simulate a simple diffusion based on Fick's 2nd law.
from pylab import *
import numpy as np
gridpoints = 128
def profile(x):
    range = 2.
    straggle = .1576
    dose = 1
    return dose/(sqrt(2*pi)*straggle)*exp(-(x-range)**2/2/straggle**2)
x = linspace(0,4,gridpoints)
nx = profile(x)
dx = x[1] - x[0] # use np.diff(x) if x is not uniform
dxdx = dx**2
figure(figsize=(12,8))
plot(x,nx)
timestep = 0.5
steps = 21
diffusion_coefficient = 0.002
for i in range(steps):
    coefficients = [-1.785714e-3, 2.539683e-2, -0.2e0, 1.6e0,
                    -2.847222e0,
                    1.6e0, -0.2e0, 2.539683e-2, -1.785714e-3]
    ccf = (np.convolve(nx, coefficients) / dxdx)[4:-4] # second order derivative
    nx = timestep*diffusion_coefficient*ccf + nx
    plot(x,nx)
For the first few time steps everything looks fine, but then I start to get high frequency noise, due to build-up from numerical errors which are amplified through the second derivative. Since it seems to be hard to increase the float precision, I'm hoping that there is something else that I can do to suppress this? I already increased the number of points that are being used to construct the 2nd derivative.
I don't have the time to study your solution in detail, but it seems that you are solving the partial differential equation with a forward Euler scheme. This is pretty easy to implement, as you show, but it can become numerically unstable if your timestep is too large. Your only options are to reduce the timestep or to make the spatial grid coarser (increase dx).
The easiest way to explain this is for the 1-D case: assume your concentration is a function of spatial coordinate x and timestep i. If you do all the math (write down your equations, substitute the partial derivatives with finite differences, should be pretty easy), you will probably get something like this:
C(x, i+1) = [1 - 2 * k] * C(x, i) + k * [C(x - 1, i) + C(x + 1, i)]
so the concentration of a point on the next step depends on its previous value and the ones of its two neighbors. It is not too hard to see that when k = 0.5, every point gets replaced by the average of its two neighbors, so a concentration profile of [...,0,1,0,1,0,...] will become [...,1,0,1,0,1,...] on the next step. If k > 0.5, such a profile will blow up exponentially. You calculate your second order derivative with a longer convolution (I effectively use [1,-2,1]), but I guess that does not change anything for the instability problem.
I don't know about normal diffusion, but based on experience with thermal diffusion, I would guess that k scales with dt * diffusion_coeff / dx^2. You thus have to choose your timestep small enough so that your simulation does not become unstable. To make the simulation stable, but still as fast as possible, choose your parameters so that k is a bit smaller than 0.5. Something similar can be derived for 2-D and 3-D cases. The easiest way to achieve this is to increase dx, since your total calculation time will scale with 1/dx^3 for a 1-D problem, 1/dx^4 for 2-D problems, and even 1/dx^5 for 3-D problems.
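To make the stability argument concrete, here is a small sketch that computes k = dt * D / dx^2 for the parameters in the question and applies the 3-point update to an alternating profile. It assumes the simple [1, -2, 1] stencil with fixed end values, so the 0.5 limit applies to that stencil only; the exact limit for the 9-point stencil in the question will differ. The function names are illustrative.

```python
import numpy as np

def diffusion_number(dt, diffusion_coefficient, dx):
    """k = dt * D / dx^2, the quantity that controls forward Euler stability."""
    return dt * diffusion_coefficient / dx**2

def ftcs_step(c, k):
    """One forward Euler step with the 3-point stencil; end values stay fixed."""
    new = c.copy()
    new[1:-1] = (1 - 2 * k) * c[1:-1] + k * (c[:-2] + c[2:])
    return new

# Parameters from the question: 128 points on [0, 4], dt = 0.5, D = 0.002
x = np.linspace(0, 4, 128)
k_question = diffusion_number(0.5, 0.002, x[1] - x[0])   # roughly 1.0, above 0.5

# Alternating profile [0, 1, 0, 1, ...] stepped with k below and above 0.5
zigzag = np.arange(21) % 2.0
stable, unstable = zigzag.copy(), zigzag.copy()
for _ in range(50):
    stable = ftcs_step(stable, 0.4)
    unstable = ftcs_step(unstable, 0.6)
```

With k = 0.4 the profile stays between 0 and 1, while with k = 0.6 the alternating component grows at every step.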
There are better methods to solve diffusion equations; I believe that Crank-Nicolson is at least standard for solving heat equations (which are also diffusion problems). The 'problem' is that this is an implicit method, which means that you have to solve a set of equations to calculate your 'concentration' at the next timestep, which is a bit of a pain to implement. But this method is guaranteed to be numerically stable, even for big timesteps.
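A Crank-Nicolson step for the 1-D case can be sketched fairly compactly with a banded solver. This is a minimal illustration under assumptions that are not in the question: the 3-point stencil, a uniform grid, and end values held fixed. It uses k = dt * D / dx^2 as above, and the function name crank_nicolson_step is hypothetical.

```python
import numpy as np
from scipy.linalg import solve_banded

def crank_nicolson_step(c, k):
    """One Crank-Nicolson step for 1-D diffusion; end values stay fixed."""
    n = c.size
    # Left-hand side (I - k/2 * L) in banded storage; boundary rows are identity
    ab = np.zeros((3, n))
    ab[0, 2:] = -k / 2        # super-diagonal of the interior rows
    ab[1, :] = 1 + k          # main diagonal
    ab[1, 0] = ab[1, -1] = 1  # boundary rows
    ab[2, :-2] = -k / 2       # sub-diagonal of the interior rows
    # Right-hand side (I + k/2 * L) applied to the current profile
    rhs = c.copy()
    rhs[1:-1] = (1 - k) * c[1:-1] + (k / 2) * (c[:-2] + c[2:])
    return solve_banded((1, 1), ab, rhs)

# Gaussian profile from the question, stepped with k far above the explicit limit
x = np.linspace(0, 4, 128)
c0 = np.exp(-(x - 2.0)**2 / (2 * 0.1576**2)) / (np.sqrt(2 * np.pi) * 0.1576)
k = 5.0
c = c0.copy()
for _ in range(21):
    c = crank_nicolson_step(c, k)
```

Stability does not guarantee accuracy: with very large k the result stays bounded but can still show small oscillations, so the timestep is then limited by the accuracy you need.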
