I have a 2D numpy array C which contains the coefficients of a 2d polynomial, such that the polynomial is given by the sum over all coefficients:
c[i,j]*x^i*y^j
How can I find the roots of this 2d polynomial?
It seems that numpy.roots only works for 1d polynomials.
This is a polynomial in two variables. In general there will be infinitely many roots (think about all the values of x and y that will yield xy=0), so an algorithm that gives you all the roots cannot exist.
It is correct as Jussi Nurminen pointed out that a multivariate polynomial does not have a finite number of roots in the general case, however it is still possible to find functions that yield all the (infinitely many) roots of a multivariate polynomial.
One solution would be to use sympy:
import numpy as np
import sympy as sp
np.random.seed(1234)
deg = (2, 3) # degrees of polynomial
x = sp.symbols(f'x:{len(deg)}') # free variables
c = np.random.uniform(-1, 1, deg) # coefficients of polynomial
eq = (np.power(np.array(x), np.indices(deg).transpose((*range(1, len(deg) + 1), 0))).prod(-1) * c).sum()
sol = sp.solve(eq, x, dict=True)
for i, s in enumerate(sol):
print(f'solution {i}:')
for k, v in s.items():
print(f' {k} = {sp.simplify(v).evalf(3)}')
# solution 0:
# x0 = (1.25e+14*x1**2 - 2.44e+14*x1 + 6.17e+14)/(-4.55e+14*x1**2 + 5.6e+14*x1 + 5.71e+14)
In this question I asked for a way to compute the closest projected point to a hyperbolic paraboloid using python.
Thanks to the answer, I was able to use the code below to calculate the closest point to multiple paraboloids.
from scipy.optimize import minimize
# This function calculate the closest projection on a hyperbolic paraboloid
# As Answered by #Jaime https://stackoverflow.com/questions/18858448/speeding-up-a-closest-point-on-a-hyperbolic-paraboloid-algorithm
def fun_single(x, p0, p1, p2, p3, p):
u, v = x
s = u*(p1-p0) + v*(p3-p0) + u*v*(p2-p3-p1+p0) + p0
return np.linalg.norm(p-s)
# Example use case:
# Generate some random data for 3 random hyperbolic paraboloids
# A real life use case will count in the tens of thousands.
import numpy as np
COUNT = 3
p0 = np.random.random_sample((COUNT,3))
p1 = np.random.random_sample((COUNT,3))
p2 = np.random.random_sample((COUNT,3))
p3 = np.random.random_sample((COUNT,3))
p = np.random.random_sample(3)
uv = []
for i in xrange(COUNT):
uv.append(minimize(fun_single, (0.5, 0.5), (p0[i], p1[i], p2[i], p3[i], p)).x)
uv = np.array(uv)
# UV projections for my random data
#[[ 0.34109572 4.39237344]
# [-0.2720813 0.17083423]
# [ 0.48993333 -0.99415568]]
Now that I have a projection for each item it's possible to find more useful info, such as which of the given items is closest to the query point, find its array index and derive more data from it, etc...
The problem with calling minimize for each item is that it becomes very slow when dealing with hundreds of thousands of items. So to try to resolve the issue I took a crack at changing the function to work with many inputs.
from numpy.core.umath_tests import inner1d
# This function calculate the closest projection to many hyperbolic paraboloids
def fun_array(x, p0, p1, p2, p3, p):
u, v = x
s = u*(p1-p0) + v*(p3-p0) + u*v*(p2-p3-p1+p0) + p0
V = p-s
return np.min(np.sqrt(inner1d(V,V)))
# Lets pass all the data to minimize
uv = minimize(fun_array, (0.5, 0.5), (p0, p1, p2, p3, p)).x
# Result: [ 0.25090064, 1.19732181]
# This corresponds to index 2 of my random data,
# which is the closest projection.
Minimizing the function fun_array is much faster than the iterative approach, but it only returns the single closest projection, not all projections.
QUESTION
Is it possible to use minimize to return all projections as with the iterative approach? And if not, is it at least possible to get the index of the "winning" array element?
The strict answer
You have to be tricky but it's not that difficult to trick minimize. The point is that minimize only works for scalar cost functions. But we can get away with summing up all your distances, since they are naturally nonnegative quantities and the global minimum is defined by the configuration where each distance is minimal. So instead of asking for the minimum points of COUNT bivariate scalar functions, instead we ask for the minimum of a single scalar function of COUNT*2 variables. This just happens to be the sum of COUNT bivariate functions. But note that I'm not convinced that this will be faster, because I can imagine higher-dimensional minimum searches to be less stable than a corresponding set of lower-dimensional independent minimum searches.
What you should definitely do is pre-allocate memory for uv and insert values into that, rather than growing a list item by item a lot of times:
uv = np.empty((COUNT,2))
for i in range(COUNT):
uv[i,:] = minimize(fun_single, (0.5, 0.5), (p0[i], p1[i], p2[i], p3[i], p)).x
Anyway, in order to use a single call to minimize we only need to vectorize your function, which is easier than you'd think:
def fun_vect(x, p0, p1, p2, p3, p):
x = x.reshape(-1,2) # dimensions are mangled by minimize() call
u,v = x.T[...,None] # u,v shaped (COUNT,1) for broadcasting
s = u*(p1-p0) + v*(p3-p0) + u*v*(p2-p3-p1+p0) + p0 # shape (COUNT,3)
return np.linalg.norm(p-s, axis=1).sum() # sum up distances for overall cost
x0 = 0.5*np.ones((COUNT,2))
uv_vect = minimize(fun_vect, x0, (p0, p1, p2, p3, p)).x.reshape(-1,2)
This function, as you see, extend the scalar one along columns. Each row corresponds to an independent minimization problem (consistently with your definition of the points). The vectorization is straightforward, the only nontrivial part is that we need to play around with dimensions to make sure that everything broadcasts nicely, and we should take care to reshape x0 on input because minimize has a habit of flattening the array-valued input position. And of course the final result has to be reshaped again. Correspondingly, an array of shape (COUNT,2) has to be provided as x0, this is the only feature from which minimize can deduce the dimensionality of your problem.
Comparison for my random data:
>>> uv
array([[-0.13386872, 0.14324999],
[ 2.42883931, 0.55099395],
[ 1.03084756, 0.35847593],
[ 1.47276203, 0.29337082]])
>>> uv_vect
array([[-0.13386898, 0.1432499 ],
[ 2.42883952, 0.55099405],
[ 1.03085143, 0.35847888],
[ 1.47276244, 0.29337179]])
Note that I changed COUNT to be 4, because I like to keep every dimension distinct when testing. This way I can be sure that I run into an error if I mess up my dimensions. Also note that in general you might want to keep the complete object returned by minimize just to make sure that everything went fine and converged.
A more useful solution
As we discussed in comments, the above solution---while perfectly answers the question---is not particularly feasible, since it takes too long to run, much longer than doing each minimization separately. The problem was interesting enough that it got me thinking. Why not try to solve the problem as exactly as possible?
What you're trying to do (now considering a single hyperboloid and a query point q) is finding the s(u,v) point with the parametrization by Jaime
s(u,v) = p0 + u * (p1 - p0) + v * (p3 - p0) + u * v * (p2 - p3 - p1 + p0)
for which the distance d(s,q) is minimal. Since the distance is a proper metric (in particular, it is non-negative), this is equivalent to minimizing d(s,q)^2. So far so good.
Let's rewrite the parametrized equation of s by introducing a few constant vectors in order to simplify the derivation:
s(u,v) = p0 + u*a + v*b + u*v*c
s - q = p0-q0 + u*a + v*b + u*v*c
= d + u*a + v*b + u*v*c
d(s,q)^2 = (s-q)^2
(In this section ^ will represent the power, because this is linear algebra.) Now, the minimum of the distance function is a stationary point, so in the u_min,v_min point we're looking for the gradient of s(u,v) with respect to u and v is zero. This is equivalent to saying that the derivative of d(s,q)^2 with respect to both u and v has to be simultaneously zero; this gives us two nonlinear equations with the unknowns u and v:
2*(s-q)*ds/du = 0 (1)
2*(s-q)*ds/dv = 0 (2)
Expanding these two equations is a somewhat tedious job. The first equation happens to be linear in u, the second in v. I collected all the terms containing u in the first equation, which gave me the relationship
u(v) = (-v^2*b.c - v*(c.d + a.b) - a.d)/(a + v*c)^2
where . represents the dot product. The above equation tells us that for whatever v we choose, equation (1) will exactly be satisfied if u is chosen thus. So we have to solve equation (2).
What I did was expand all the terms in equation (2), and substitute u(v) into u. The original equation had polynomial terms of 1,u,v,uv,u^2,u^2v, so I can tell you this is not pretty. With some minor assumptions of no divergence (which divergences would probably correspond to the equivalent of vertical lines in the case of a line fitting problem), we can arrive at the following beautiful equation:
(b.d + v*b^2)*f^2 - (c.d + a.b + 2*v*b.c)*e*f + (a.c + v*c^2)*e^2 = 0
with the new scalars defined as
e = v^2*b.c + v*(c.d + a.b) + a.d
f = (a + v*c)^2 = (a^2 + 2*v*a.c + v^2*c^2)
Whatever v solves this equation, the corresponding (u(v),v) point will correspond to a stationary point of the distance. We should first note that this equation considers the root of a fifth-order polynomial if v. There's guaranteed to be at least one real root, and in the worst case there can be as many as 5 real roots. Whether these correspond to minima, maxima, or (in unlikely cases) saddle points is open for discussion.
The real benefit of the above result is that we have a fighting chance of finding all the roots of the equation! This is a huge deal, since nonlinear root searching/minimization will in general give you only one root at a time, without being able to tell you if you've missed any. Enter numpy.polynomial.polynomial.polyroots. Despite all the linear algebra fluff surrounding it, we're only looking for the (at most 5!) root of a polynomial, for which we can test the distances and choose the global minimum (if necessary). If there's only one root, we can be sure that it's the minimum based on geometrical considerations.
Note that I haven't mentioned a caveat yet: the polynomial library can only work with one polynomial at a time. We will still have to loop over each hyperboloid manually. But here's the deal: we will be able to guarantee that we're finding the exact minimum, rather than unknowingly accepting local distance minima. And it might even be faster than minimize. Let's see:
import numpy as np
# generate dummy inputs
COUNT = 100
p0 = np.random.random_sample((COUNT,3))
p1 = np.random.random_sample((COUNT,3))
p2 = np.random.random_sample((COUNT,3))
p3 = np.random.random_sample((COUNT,3))
p = np.random.random_sample(3)
def mydot(v1,v2):
"""generalized dot product for multidimensional arrays: (...,N,3)x(...,N,3) -> (...,N,1)"""
# (used in u_from_v for vectorized dot product)
return np.einsum('...j,...j->...',v1,v2)[...,None]
def u_from_v(v, a, b, c, d):
"""return u(v) corresponding to zero of gradient"""
# use mydot() instead of dot to enable array-valued v input
res = (- v**2*mydot(b,c) - v*(mydot(c,d)+mydot(a,b)) - mydot(a,d))/np.linalg.norm(a+v*c, axis=-1, keepdims=True)**2
return res.squeeze()
def check_distance(uv, p0, p1, p2, p3, p):
"""compute the distance from optimization results to query point"""
u,v = uv.T[...,None]
s = u*(p1-p0) + v*(p3-p0) + u*v*(p2-p3-p1+p0) + p0
return np.linalg.norm(p-s, axis=-1)
def poly_for_v(a, b, c, d):
"""return polynomial representation of derivative of d(s,p)^2 for the parametrized s(u(v),v) point"""
# only works with a scalar problem:( one polynomial at a time
# v is scalar, a-b-c-d are 3-dimensional vectors (for a given paraboloid)
# precompute scalar products appearing multiple times in the formula
ab = a.dot(b)
ac = a.dot(c)
cc = c.dot(c)
cd = c.dot(d)
bc = b.dot(c)
Poly = np.polynomial.polynomial.Polynomial
e = Poly([a.dot(d), cd+ab, bc])
f = Poly([a.dot(a), 2*ac, cc])
res = Poly([b.dot(d), b.dot(b)])*f**2 - Poly([cd+ab,2*bc])*e*f + Poly([ac,cc])*e**2
return res
def minimize_manually(p0, p1, p2, p3, p):
"""numpy polynomial version for the minimization problem"""
# auxiliary arrays, shape (COUNT,3)
a = p1 - p0
b = p3 - p0
c = p2 - p3 - p1 + p0
d = p0 - p
# preallocate for collected result
uv_min = np.empty((COUNT,2))
for k in range(COUNT):
# collect length-3 vectors needed for a given surface
aa,bb,cc,dd = (x[k,:] for x in (a,b,c,d))
# compute 5 complex roots of the derivative distance
roots = poly_for_v(aa, bb, cc, dd).roots()
# keep exactly real roots
vroots = roots[roots.imag==0].real
if vroots.size == 1:
# we're done here
vval, = vroots
uval = u_from_v(vval, aa, bb, cc, dd)
uv_min[k,:] = uval,vval
else:
# need to find the root with minimal distance
uvals = u_from_v(vroots[:,None], aa, bb, cc, dd)
uvtmp = np.stack((uvals,vroots),axis=-1)
dists = check_distance(uvtmp, p0[k,:], p1[k,:], p2[k,:], p3[k,:], p)
winner = np.argmin(dists) # index of (u,v) pair of minimum
uv_min[k,:] = uvtmp[winner,:]
return uv_min
uv_min = minimize_manually(p0, p1, p2, p3, p)
# for comparison with the minimize-based approaches:
# distances = check_distance(uv_manual,p0,p1,p2,p3,p))
The above example has COUNT of 100, but if you start with COUNT=1 and keep running both the minimize version and the above exact version, you'll see roughly once in every 10-20 runs that the minimize-based approach misses the real minimum. So the above is safer, it's guaranteed to find the proper minima.
I also did some timing checks with COUNT=100, around 100 ms for the polynomial-based solution, around 200 ms for the minimize-based looping version. With COUNT=1000: 1 second for the polynomial, 2 seconds for the looping minimize-based. Considering how even for larger problems the above is both more precise and more efficient, I see no reason why not to use this instead.
I have been trying to solve the second order non-linear differential equation for Newton's Law of Universal Gravitation (inverse square law):
x(t)'' = -GM/(x**2)
for the motion of a satellite approaching the earth (in this case a point-mass) in one dimension
using numpy.odeint with a series of first order differential equations, but the operation has been yielding incorrect results when compared to Mathematica or to simplified forms of the law (∆x = (1/2)at^2).
This is the code for the program:
import numpy as np
from scipy.integrate import odeint
def deriv(x, t): #derivative function, where x[0] is x, x[1] is x' or v, and x2 = x'' or a
x2 = -mu/(x[0]**2)
return x[1], x2
init = 6371000, 0 #initial values for x and x'
a_t = np.linspace(0, 20, 100) #time scale
mu = 398600000000000 #gravitational constant
x, _ = odeint(deriv, init, a_t).T
sol = np.column_stack([a_t, x])
which yields an array with coupled a_t and x position values as the satellite approaches the earth from an initial distance of 6371000 m (the average radius of the earth). One would expect, for instance, that the object would take approximately 10 seconds to fall 1000 m at the surface from 6371000m to 6370000m (because the solution to 1000 = 1/2(9.8)(t^2) is almost exactly 10), and the mathematica solution to the differential equation puts the value at slightly above 10s as well.
Yet that value according the odeint solution and the sol array is nearly 14.4.
t x
[ 1.41414141e+01, 6.37001801e+06],
[ 1.43434343e+01, 6.36998975e+06],
Is there significant error in the odeint solution, or is there a major problem in my function/odeint usage? Thanks!
(because the solution to 1000 = 1/2(9.8)(t^2) is almost exactly 10),
This is the right sanity check, but something's off with your arithmetic. Using this approximation, we get a t of
>>> (1000 / (1/2 * 9.8))**0.5
14.285714285714285
as opposed to a t of ~10, which would give us only a distance of
>>> 1/2 * 9.8 * 10**2
490.00000000000006
This expectation of ~14.29 is very close to the result you observe:
>>> sol[abs((sol[:,1] - sol[0,1]) - -1000).argmin()]
array([ 1.42705427e+01, 6.37000001e+06])
Is there a quick method of expanding and solving binomials raised to fractional powers in Scypy/numpy?
For example, I wish to solve the following equation
y * (1 + x)^4.8 = x^4.5
where y is known (e.g. 1.03).
This requires the binomial expansion of (1 + x)^4.8.
I wish to do this for millions of y values and so I'm after a nice and quick method to solve this.
I've tried the sympy expand (and simplification) but it seems not to like the fractional exponent. I'm also struggling with the scipy fsolve module.
Any pointers in the right direction would be appreciated.
EDIT:
By far the simplest solution I found was to generate a truth table (https://en.wikipedia.org/wiki/Truth_table) for assumed values of x (and known y). This allows fast interpolation of the 'true' x values.
y_true = np.linspace(7,12, 1e6)
x = np.linspace(10,15, 1e6)
a = 4.5
b = 4.8
y = x**(a+b) / (1 + x)**b
x_true = np.interp(y_true, y, x)
EDIT: Upon comparing output with that of Woldfram alpha for y=1.03, it looks like fsolve will not return complex roots. https://stackoverflow.com/a/15213699/3456127 is a similar question that may be of more help.
Rearrange your equation: y = x^4.5 / (1+x)^4.8.
Scipy.optimize.fsolve() requires a function as its first argument.
Either:
from scipy.optimize import fsolve
import math
def theFunction(x):
return math.pow(x, 4.5) / math.pow( (1+x) , 4.8)
for y in millions_of_values:
fsolve(theFunction, y)
Or using lambda (anonymous function construct):
from scipy.optimize import fsolve
import math
for y in millions_of_values:
fsolve((lambda x: (math.pow(x, 4.5) / math.pow((1+x), 4.8))), y)
I don't think you need the binomial expansion. Horner's method for evaluating polynomials implies it is better to have a factored form of the polynomials than an expanded form.
In general, nonlinear equation solving can benefit from symbolic differentiation, which is not too difficult to do by hand for your equation. Providing an analytical expression for the derivative saves the solver from having to estimate the derivatives numerically. You can write two functions: one that returns the value of the function and another that returns the derivative (i.e. the Jacobian of the function for this simple 1-D function), as described in the docs for scipy.optimize.fsolve(). Some code that takes this approach:
import timeit
import numpy as np
from scipy.optimize import fsolve
def the_function(x, y):
return y * (1 + x)**(4.8) / x**(4.5) - 1
def the_derivative(x, y):
l_dh = x**(4.5) * (4.8 * y * (1 + x)**(3.8))
h_dl = y * (1 + x)**(4.8) * 4.5 * x**3.5
square_of_whats_below = x**9
return (l_dh - h_dl)/square_of_whats_below
print fsolve(the_function, x0=1, args=(0.1,))
print '\n\n'
print fsolve(the_function, x0=1, args=(0.1,), fprime=the_derivative)
%timeit fsolve(the_function, x0=1, args=(0.1,))
%timeit fsolve(the_function, x0=1, args=(0.1,), fprime=the_derivative)
...gives me this output:
[ 1.79308495]
[ 1.79308495]
10000 loops, best of 3: 105 µs per loop
10000 loops, best of 3: 136 µs per loop
which shows that analytical differentiation did not result in any speedup in this particular case. My guess is that the numerical approximation to the function involves easier-to-compute functions like multiplication, squaring, and/or addition, instead of functions like fractional exponentiation.
You can get additional simplification by taking the log of your equation and plotting it. With a little algebra, you should be able to obtain an explicit function for ln_y, the natural log of y. If I've done the algebra correctly:
def ln_y(x):
return 4.5 * np.log(x/(1.+x)) - 0.3 * np.log(1.+x)
You can plot this function, which I have done for both lin-lin and log-log plots:
%matplotlib inline
import matplotlib.pyplot as plt
x_axis = np.linspace(1, 100, num=2000)
f, ax = plt.subplots(1, 2, figsize=(8, 4))
ln_y_axis = ln_y(x_axis)
ax[0].plot(x_axis, np.exp(ln_y_axis)) # plotting y vs. x
ax[1].plot(np.log(x_axis), ln_y_axis) # plotting ln(y) vs. ln(x)
This shows that there are two values of x for every y as long as y is below a critical value. The minimum, singular value of y occurs when x=ln(15) and has y value of:
np.exp(ln_y(15))
0.32556278053267873
So your example y value of 1.03 results in no (real) solution for x.
This behavior we have discerned from the plots is recapitulated by the scipy.optimize.fsolve() call we made before:
print fsolve(the_function, x0=1, args=(0.32556278053267873,), fprime=the_derivative)
[ 14.99999914]
That shows that guessing x=1 initially, when y is 0.32556278053267873, gives x=15 as the solution. Trying larger y values:
print fsolve(the_function, x0=15, args=(0.35,), fprime=the_derivative)
results in an error:
/Users/curt/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:5: RuntimeWarning: invalid value encountered in power
The reason for the error is that the power function in Python (or numpy) do not accept negative bases for fractional exponents by default. You can fix that by supplying the powers as a complex number, i.e. write x**(4.5+0j) instead of x**4.5 but are you really interested in complex x values that would solve your equation?
I have an equation 'a*x+logx-b=0,(a and b are constants)', and I want to solve x. The problem is that I have numerous constants a(accordingly numerous b). How do I solve this equation by using python?
You could check out something like
http://docs.scipy.org/doc/scipy-0.13.0/reference/optimize.nonlin.html
which has tools specifically designed for these kinds of equations.
Cool - today I learned about Python's numerical solver.
from math import log
from scipy.optimize import brentq
def f(x, a, b):
return a * x + log(x) - b
for a in range(1,5):
for b in range(1,5):
result = brentq(lambda x:f(x, a, b), 1e-10, 20)
print a, b, result
brentq provides estimate where the function crosses the x-axis. You need to give it two points, one which is definitely negative and one which is definitely positive. For negative point choose number that is smaller than exp(-B), where B is maximum value of b. For positive point choose number that's bigger than B.
If you cannot predict range of b values, you can use a solver instead. This will probably produce a solution - but this is not guaranteed.
from scipy.optimize import fsolve
for a in range(1,5):
for b in range(1,5):
result = fsolve(f, 1, (a,b))
print a, b, result