I'm working with Python/numpy/scipy to write a small ray tracer. Surfaces are modelled as two-dimensional functions giving a height above a normal plane. I reduced the problem of finding the point of intersection between ray and surface to finding the root of a function with one variable. The functions are continuous and continuously differentiable.
Is there a way to do this more efficiently than simply looping over all the functions, using scipy root finders (and maybe using multiple processes)?
Edit: The functions are the difference between a linear function representing the ray and the surface function, constrained to a plane of intersection.
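For concreteness, the kind of single-variable function I mean looks roughly like this (a sketch with a made-up surface h and ray, not my actual code):

import numpy as np

# Hypothetical setup: a ray r(t) = o + t*d and a surface given as a height
# field z = h(x, y) over the plane z = 0.
def h(x, y):
    return 0.1 * np.sin(x) * np.cos(y)   # example surface height function

def f(t, o, d):
    p = o + t * d                # point on the ray at parameter t
    return p[2] - h(p[0], p[1])  # signed height of the ray above the surface

o = np.array([0.0, 0.0, 1.0])    # ray origin
d = np.array([0.1, 0.0, -1.0])   # ray direction
# the intersection is the root of f(t, o, d) in t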
The following example shows how to compute the roots of 1 million copies of the function x**(a+1) - b (all with different a and b) in parallel using the bisection method. It takes about 12 seconds here.
import numpy

def F(x, a, b):
    return numpy.power(x, a + 1.0) - b

N = 1000000
a = numpy.random.rand(N)
b = numpy.random.rand(N)

# initial bracket: the root of each function lies between 0 and 1000
x0 = numpy.zeros(N)
x1 = numpy.ones(N) * 1000.0

max_step = 100
for step in range(max_step):
    x_mid = (x0 + x1) / 2.0
    F0 = F(x0, a, b)
    F1 = F(x1, a, b)
    F_mid = F(x_mid, a, b)

    # move whichever bracket end has the same sign as the midpoint
    x0 = numpy.where(numpy.sign(F_mid) == numpy.sign(F0), x_mid, x0)
    x1 = numpy.where(numpy.sign(F_mid) == numpy.sign(F1), x_mid, x1)

    error_max = numpy.amax(numpy.abs(x1 - x0))
    print("step=%d error max=%f" % (step, error_max))
    if error_max < 1e-6:
        break
The basic idea is to run all the usual steps of a root finder in parallel on a vector of variables, using a function that can be evaluated on a vector of variables along with equivalent vector(s) of parameters that define the individual component functions. Conditionals are replaced with a combination of masks and numpy.where(). This can continue until all roots have been found to the required precision, or alternatively until enough roots have been found that it is worth removing them from the problem and continuing with a smaller problem that excludes them.
The functions I chose to solve are arbitrary, but it helps if the functions are well-behaved; in this case all functions in the family are monotonic and have exactly one positive root. Additionally, for the bisection method we need guesses for the variable that give different signs of the function, and those happen to be quite easy to come up with here as well (the initial values of x0 and x1).
The above code uses perhaps the simplest root finder (bisection), but the same technique could easily be applied to Newton-Raphson, Ridder's method, etc. The fewer conditionals there are in a root-finding method, the better suited it is to this approach. However, you will have to reimplement whatever algorithm you choose; there is no way to use an existing library root-finder function directly.
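For illustration, here is a minimal sketch of the same vectorized approach with a hand-rolled Newton-Raphson update; the derivative of this particular family is easy to write down, and starting from x = 1 keeps every iterate positive for these a and b:

import numpy

def F(x, a, b):
    return numpy.power(x, a + 1.0) - b

def dF(x, a, b):
    # analytic derivative of F with respect to x
    return (a + 1.0) * numpy.power(x, a)

N = 1000000
a = numpy.random.rand(N)
b = numpy.random.rand(N)

x = numpy.ones(N)   # starting guess; every iterate stays positive for this family
for step in range(50):
    dx = F(x, a, b) / dF(x, a, b)
    x = x - dx
    if numpy.amax(numpy.abs(dx)) < 1e-6:
        break

As with the bisection version, the only per-iteration work is a handful of whole-array operations.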
The bisection snippet above is written with clarity in mind, not speed. Avoiding the repetition of some calculations, in particular evaluating the function only once per iteration instead of three times, speeds this up to about 9 seconds, as follows:
...
F0 = F(x0, a, b)
F1 = F(x1, a, b)

max_step = 100
for step in range(max_step):
    x_mid = (x0 + x1) / 2.0
    F_mid = F(x_mid, a, b)

    mask0 = numpy.sign(F_mid) == numpy.sign(F0)
    mask1 = numpy.sign(F_mid) == numpy.sign(F1)

    x0 = numpy.where(mask0, x_mid, x0)
    x1 = numpy.where(mask1, x_mid, x1)
    F0 = numpy.where(mask0, F_mid, F0)
    F1 = numpy.where(mask1, F_mid, F1)
...
For comparison, using scipy.optimize.bisect() to find one root at a time takes about 94 seconds:
for i in range(N):
    x_root = scipy.optimize.bisect(lambda x: F(x, a[i], b[i]), x0[i], x1[i], xtol=1e-6)
Sometime in the past few years, scipy.optimize.newton gained vectorization support. Applied to the example from the other answer, it now looks like this:
import numpy as np
from scipy import optimize

def F(x, a, b):
    return np.power(x, a + 1.0) - b

N = 1000000
a = np.random.rand(N)
b = np.random.rand(N)

optimize.newton(F, np.zeros(N), args=(a, b))
This runs just as fast as the vectorized bisection method in the other answer.
I usually rely on Wolfram Mathematica for this kind of thing, but I've been delving into Python recently and the results are so much better. Basically, I'm looking for numerical solutions for systems like the following.
system:

r0*r2 + r1*r3 = 0
r0*r1 + r1*r2 + r2*r3 + r0*r3 < 0
r0**2 + r1**2 + r2**2 + r3**2 - 4*(r0 + r1 + r2 + r3)**2 < 0
Well, I know that there are solutions, because Wolfram Mathematica found a single one (0.0858875, 0.0116077, -0.156661, 1.15917). What I tried to do in Python is this brute-force code:
import numpy as np

START = -3
END = 3
STEP = 0.1

for r0 in np.arange(START, END, STEP):
    for r1 in np.arange(START, END, STEP):
        for r2 in np.arange(START, END, STEP):
            for r3 in np.arange(START, END, STEP):
                eq0 = r0*r2 + r1*r3
                eq1 = r0*r1 + r1*r2 + r2*r3 + r0*r3
                eq2 = r0**2 + r1**2 + r2**2 + r3**2 - 4*(r0 + r1 + r2 + r3)**2
                if eq0 == 0 and eq1 < 0 and eq2 < 0:
                    print(r0, r1, r2, r3)
Edit: I'm okay with things like -0.00001 < eq0 < 0.00001 instead of eq0 == 0.
Well, although it didn't find solutions in this case, the brute-force method worked well for other systems I'm dealing with, particularly when there are fewer equations and variables. Starting with four variables, it becomes really difficult.
I'm sorry if I'm asking too much. I'm completely new to Python, so I also don't know if this is actually trivial. Maybe fsolve would be useful? I'm not sure if it works with inequalities. Also, even when the systems I encounter have only equalities, they always have more variables than equations, like this one:
system2: (equations not reproduced here)
Hence 'fsolve' is not appropriate, right?
As soon as your system contains inequalities, you need to formulate it as an optimization problem and solve it with scipy.optimize.minimize. Otherwise, you can use scipy.optimize.root or scipy.optimize.fsolve to solve the equation system. Note that the former approach is essentially what happens behind the scenes in root and fsolve anyway, i.e. both solve a least-squares optimization problem.
In general, the problem
g_1(x) = 0, ..., g_m(x) = 0
h_1(x) < 0, ..., h_p(x) < 0
can be formulated as
min  g_1(x)**2 + ... + g_m(x)**2
s.t. -1.0*(h_1(x) + eps) >= 0
     ...
     -1.0*(h_p(x) + eps) >= 0
where eps is a tolerance to model the strict inequality.
Hence, you can solve your first problem as follows:
import numpy as np
from scipy.optimize import minimize

def obj(r):
    return (r[0]*r[2] + r[1]*r[3])**2

eps = 1.0e-6

constrs = [
    {'type': 'ineq', 'fun': lambda r: -1.0*(r[0]*r[1] + r[1]*r[2] + r[2]*r[3] + r[0]*r[3] + eps)},
    {'type': 'ineq', 'fun': lambda r: -1.0*(np.sum(r**2) - 4*(np.sum(r))**2 + eps)}
]

# res.x contains the solution
res = minimize(obj, x0=np.ones(4), constraints=constrs)
Your second problem can be solved similarly; there you only need to remove the constraints. Alternatively, you can use root, where it's worth mentioning that it solves F(x) = 0 for a function F: R^N -> R^N, i.e. a function of N variables that returns an N-dimensional vector. In case your function has fewer equations than variables, you can simply fill up the vector with zeros:
import numpy as np
from scipy.optimize import root

def F(r):
    vals = np.zeros(r.size)
    vals[0] = np.dot(r[:5], r[1:]) + r[0]*r[5]
    vals[1] = r[0]*r[3] + r[1]*r[4] + r[2]*r[5]
    vals[2] = np.sum(r**2) - 3*np.sum(r)**2
    return vals

# res.x contains your solution
res = root(F, x0=np.ones(6))
Not really an answer, but you can simplify this a lot using itertools.product:
from itertools import product
import numpy as np

START = -3
END = 3
STEP = 0.1

for r0, r1, r2, r3 in product(np.arange(START, END, STEP), repeat=4):
    print(r0, r1, r2, r3)
Not sure if your problem is a root-finding problem or a minimization-with-constraints problem, but take a look at scipy.optimize (which also contains linprog); maybe one of those methods can be bent to your application.
I am trying to calculate the second-order autocorrelation function
g(lag) = < P2 (m(t) . m(t+lag)) >
for a given numpy array m where P2 is the second-order Legendre polynomial P2(x) = 0.5 * (3*x**2-1). The average is calculated over all initial points t for all possible lag times lag.
I have written a simple function to calculate g(lag):
import numpy as np

def calc_tau_fast_build(vector):
    maxLag = int(np.floor(len(vector) / 2))
    g = np.zeros(maxLag)
    for tau in range(0, maxLag):
        w = 0
        tmp = 0
        for t in range(0, len(vector) - tau):
            theta = vector[t] * vector[tau + t]
            tmp = tmp + 0.5*(3*theta**2 - 1)
            w = w + 1
        g[tau] = tmp/w
    return g
This works just fine, however for large inputs the performance is rather poor.
I would like to replace the entire function with numpy.correlate if possible. In fact, without the P2 = 0.5*(3*theta**2-1) part, I can get the same results, up to a normalization constant, with
res = np.correlate(vector, vector, mode='full')
res[res.size // 2:]
I basically couldn't think of a way to incorporate the P2 function into the numpy.correlate call without playing around with the numpy source code. Could anyone please suggest a way to use numpy.correlate in this context?
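A sketch of how the P2 term can be folded into a correlate-based computation: since 0.5*(3*theta**2 - 1) = 1.5*theta**2 - 0.5, the average over t reduces to an ordinary autocorrelation of the squared signal, rescaled and shifted. The function name below is just illustrative:

import numpy as np

def calc_tau_correlate(vector):
    # P2(m(t)*m(t+lag)) = 1.5*(m(t)**2)*(m(t+lag)**2) - 0.5, so g(lag) is the
    # autocorrelation of the squared signal, normalized per lag.
    v2 = np.asarray(vector, dtype=float) ** 2
    max_lag = len(v2) // 2
    corr = np.correlate(v2, v2, mode='full')[len(v2) - 1:]  # lags 0, 1, 2, ...
    counts = np.arange(len(v2), 0, -1)  # number of (t, t + lag) pairs per lag
    return 1.5 * corr[:max_lag] / counts[:max_lag] - 0.5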
I need to plot a graph showing 2 variables, with a second-order ODE solved by RK4. So far I've done this:
from numpy import arange
from pylab import plot, xlabel, ylabel, show

Qger = 400
K = 20
T1 = 150
T2 = 60
N = 1000
h = (T2 - T1)/N

rpoints = arange(6.0, 8.0, h)
xpoints = []
x = 423

def df(s, t):
    dTdt = -Qger*t/(2*K) + 172.8/t
    return dTdt

for r in rpoints:
    xpoints.append(x)
    k1 = h*df(x, r)
    k2 = h*df(x + 0.5*k1, r + 0.5*h)
    k3 = h*df(x + 0.5*k2, r + 0.5*h)
    k4 = h*df(x + k3, r + h)
    x += (k1 + 2*k2 + 2*k3 + k4)/6

plot(rpoints, xpoints)
xlabel("Raio")
ylabel("Temperatura")
show()
But that's an RK4 for a first-order ODE, because I didn't know better and integrated once by hand; I can't do that here, and I can't use scipy either, so can anyone explain to me how to integrate this function or use RK4 with a second-order ODE? The function is below.
This is the function, only T and r are variables, the rest is 0
You should be able to put the above in a "semi-discrete" form, that is to say dT/dt in terms of only partial derivatives with respect to r. If you can then find a numerical or other approximation to the terms equivalent to dT/dt, i.e. the RHS of dT/dt = df(r, ...), then explicit RK4 is applicable.
In this approach, the time-stepping method (RK4) is only applied to the first-order derivative of temperature with respect to time.
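If the missing equation is an ordinary second-order ODE in r rather than a PDE, the standard trick is to rewrite T'' = f(r, T, T') as a first-order system and apply the same RK4 update to both components. A minimal sketch, with a placeholder right-hand side since the actual equation isn't reproduced above:

import numpy as np
import matplotlib.pyplot as plt

# Rewrite a generic second-order ODE T'' = f(r, T, T') as a first-order system
# y = [T, dT/dr], so dy/dr = [y[1], f(r, y[0], y[1])], and step it with RK4.

def f(r, T, dT):
    # Placeholder right-hand side (hypothetical); substitute the real equation here.
    return -dT / r - T

def deriv(r, y):
    return np.array([y[1], f(r, y[0], y[1])])

def rk4(deriv, r0, r1, y0, n):
    h = (r1 - r0) / n
    rs = np.linspace(r0, r1, n + 1)
    ys = np.zeros((n + 1, len(y0)))
    ys[0] = y0
    for i in range(n):
        r, y = rs[i], ys[i]
        k1 = h * deriv(r, y)
        k2 = h * deriv(r + 0.5*h, y + 0.5*k1)
        k3 = h * deriv(r + 0.5*h, y + 0.5*k2)
        k4 = h * deriv(r + h, y + k3)
        ys[i + 1] = y + (k1 + 2*k2 + 2*k3 + k4) / 6
    return rs, ys

# initial temperature 423 and an assumed initial slope of 0 at r = 6
rs, ys = rk4(deriv, 6.0, 8.0, np.array([423.0, 0.0]), 1000)
plt.plot(rs, ys[:, 0])
plt.xlabel("Raio")
plt.ylabel("Temperatura")
plt.show()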
I have two programs, one that can take in N coupled ODEs and one that uses 2 coupled ODEs. When I input the same 2 ODEs into both programs, with the same time span, I get different answers. I know the correct answer, so I can deduce that my N-equation program is wrong.
Here is the code for the 2 equation dedicated one:
# solve the coupled system dy/dt = f(y, t)
def f(y, t):
    """Returns the collection of first-order
    coupled differential equations"""
    #v11i = y[0]
    #v22i = y[1]
    #v12i = y[2]

    print(y[0])

    # the model equations
    f0 = dHs(tRel, vij)[0].subs(v12, y[2])
    f1 = dHs(tRel, vij)[3].subs(v12, y[2])
    f2 = dHs(tRel, vij)[1].expand().subs([(v11, y[0]), (v22, y[1]), (v12, y[2])])
    return [f0, f1, f2]

# Initial conditions for graphing
v110 = 6
v220 = 6
v120 = 4
y0 = [v110, v220, v120]  # initial condition vector
sMesh = np.linspace(0, 1, 10e3)  # time grid

# Solve the DEs
soln = odeint(f, y0, sMesh)
and here is the N-equation one:
def f(y, t):
    """Returns the derivative of H_s with initial
    values plugged in"""
    # the model equations
    print(y[0])
    for i in range(0, len(dh)):
        for j in range(0, len(y)):
            dh[i] = dh[i].subs(v[j], y[j])
    dhArray = []
    for i in range(0, len(dh)):
        dhArray.append(dh[i])
    return dhArray

sMesh = np.linspace(0, 1, 10e3)  # time grid
dh = dHsFunction(t, V_s).expand()
soln = odeint(f, v0, sMesh)
where dHs(tRel, vij) = dHsFunction(t, V_s), i.e. the exact same ODEs. Similarly, y0 and v0 are exactly the same. But when I print y[0] in the N-equation case, I get an output of:
6.0
5.99999765602
5.99999531204
5.97655553477
5.95311575749
5.92967598021
5.69527820744
5.46088043467
5.2264826619
2.88250493418
0.53852720647
-1.80545052124
-25.2452277984
-48.6850050755
-72.1247823527
-306.522555124
as opposed to the dedicated 2-equation case of:
6.0
5.99999765602
5.99999765602
5.99999531205
5.99999531205
5.98848712729
5.98848712125
5.97702879748
5.97702878476
5.96562028875
5.96562027486
5.91961750442
5.91961733611
5.93039037809
5.93039029335
5.89564277275
5.89564273736
5.86137647436
5.86137638807
5.82758984835
etc.
where the second result is the correct one and produces the proper graphs.
Please let me know if more code is needed or anything else. Thanks.
Your second version of f modifies the value of the global variable dh.
On the first call, you substitute values into it, and those same substituted values are then used in all subsequent calls.
Avoid that by working on a copy, e.g. dh_tmp = list(dh), inside the function.
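A minimal sketch of that fix, assuming the same global dh and v as in the question:

def f(y, t):
    """Returns the derivative of H_s with the current values plugged in."""
    # Substitute into a copy so the global list of symbolic expressions
    # is left untouched between calls.
    dh_tmp = list(dh)
    for i in range(len(dh_tmp)):
        for j in range(len(y)):
            dh_tmp[i] = dh_tmp[i].subs(v[j], y[j])
    return dh_tmp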
Consider my code:

a, b, c = np.loadtxt('test.dat', dtype='double', unpack=True)

a, b, and c all have the same array length.
# q, x, y, z are assumed preallocated, e.g. q = np.zeros(len(a)), etc.
for i in range(len(a)):
    q[i] = 3*10**5*c[i]/100
    x[i] = q[i]*math.sin(a[i])*math.cos(b[i])
    y[i] = q[i]*math.sin(a[i])*math.sin(b[i])
    z[i] = q[i]*math.cos(a[i])
I am trying to find all the combinations of differences between 2 points in x, y, z to evaluate this equation for each pair: (xi - xj) + (yi - yj) + (zi - zj) = r.
I use this combination code
for combinations in it.combinations(x, 2):
    xdist = (combinations[0] - combinations[1])
for combinations in it.combinations(y, 2):
    ydist = (combinations[0] - combinations[1])
for combinations in it.combinations(z, 2):
    zdist = (combinations[0] - combinations[1])

r = (xdist + ydist + zdist)
This takes a long time in Python for the large file I have, and I am wondering if there is a faster way to get my array of r values, preferably using a nested loop?
Such as
for i in range(?):
    for j in range(?):
Since you're apparently using numpy, let's actually use numpy; it'll be much faster. It's almost always faster and usually easier to read if you avoid Python loops entirely when working with numpy, and use its vectorized array operations instead.
a, b, c = np.loadtxt('test.dat', dtype='double', unpack=True)
q = 3e5 * c / 100 # why not just 3e3 * c?
x = q * np.sin(a) * np.cos(b)
y = q * np.sin(a) * np.sin(b)
z = q * np.cos(a)
Now, your example code after this doesn't do what you probably want it to do - notice how you just say xdist = ... each time? You're overwriting that variable and not doing anything with it. I'm going to assume you want the squared euclidean distance between each pair of points, though, and make a matrix dists with dists[i, j] equal to the distance between the ith and jth points.
The easy way, if you have scipy available:
import scipy.spatial.distance

# stack the points into a num_pts x 3 matrix
pts = np.hstack([thing.reshape((-1, 1)) for thing in (x, y, z)])

# get squared euclidean distances in a matrix
dists = scipy.spatial.distance.squareform(scipy.spatial.distance.pdist(pts, 'sqeuclidean'))
If your list is enormous, it's more memory-efficient not to use squareform, but then the result is in a condensed format that's a little harder to use when you want to look up the distance for a specific pair.
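For reference, a small sketch of how the condensed format can be indexed; pdist stores the pair (i, j) with i < j at a computable offset:

import numpy as np
from scipy.spatial.distance import pdist

def condensed_index(n, i, j):
    # index of the pair (i, j), with i < j, in the condensed distance vector
    # returned by pdist for n points
    return n*i - i*(i + 1)//2 + (j - i - 1)

# tiny usage example with hypothetical points
pts = np.random.rand(5, 3)
d = pdist(pts, 'sqeuclidean')
i, j = 1, 3
print(d[condensed_index(len(pts), i, j)])  # squared distance between points 1 and 3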
Slightly harder, if you can't / don't want to use scipy:
pts = np.hstack([thing.reshape((-1, 1)) for thing in (x, y, z)])
sqnorms = np.sum(pts ** 2, axis=1)
dists = sqnorms.reshape((-1, 1)) - 2 * np.dot(pts, pts.T) + sqnorms
which basically implements the formula (a - b)^2 = a^2 - 2 a b + b^2, but vectorized over all pairs of points.
Apologies for not posting a full solution, but you should avoid nesting calls to range(), as it creates a new list (in Python 2; a range object in Python 3) every time it gets called. You are better off either calling range() once and storing the result, or using a loop counter instead.
For example, instead of:
max = 50
for number in range(0, max):
    doSomething(number)
...you would do:
max = 50
current = 0
while current < max:
    doSomething(current)
    current += 1
Well, the complexity of your calculation is pretty high. Also, you need huge amounts of memory if you want to store all r values in a single list. Often you don't need a list, and a generator might be enough for what you want to do with the values.
Consider this code:
from itertools import combinations

def calculate(x, y, z):
    for xi, xj in combinations(x, 2):
        for yi, yj in combinations(y, 2):
            for zi, zj in combinations(z, 2):
                yield (xi - xj) + (yi - yj) + (zi - zj)
This returns a generator that computes only one value each time you advance it with next().
gen = calculate(range(10), range(10, 20), range(20, 30))
next(gen)  # returns -3
next(gen)  # returns -4, and so on