Trying to infer model parameters of a Lotka-Volterra model - Python

import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

def derivative(X, t, A, B, C, D):
    x, y = X
    dotx = x * (A - B * y)
    doty = y * (-D + C * x)
    return np.array([dotx, doty])

def integration(t, A, B, C, D, X0):
    res = odeint(derivative, X0, t, args=(A, B, C, D))
    return res
X0 = [30, 4]
X = np.array([[30. ,  4. ],
              [47.2,  6.1],
              [70.2,  9.8],
              [77.4, 35.2],
              [36.3, 59.4],
              [20.6, 41.7],
              [18.1, 19. ],
              [21.4, 13. ],
              [22. ,  8.3],
              [25.4,  9.1],
              [27.1,  7.4],
              [40.3,  8. ],
              [57. , 12.3],
              [76.6, 19.5],
              [52.3, 45.7],
              [19.5, 51.1],
              [11.2, 29.7],
              [ 7.6, 15.8],
              [14.6,  9.7],
              [16.2, 10.1],
              [24.7,  8.6]])
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0]
XData = t
YData = X
curve_fit(integration, XData, YData)
So X is my data: the first column is species x and the second column is species y.
I tried to infer the parameters of this Lotka-Volterra model using odeint and curve_fit.
The error says not enough values to unpack (expected 2, got 1).
I am actually not even sure whether I should infer the parameters this way.
Can anyone help me with this? Are there any better methods of inferring parameters?
Thanks in advance!

Note that ydata is required to be a flat array. While it is strongly suggested that xdata contain one input value or vector per element of ydata, there is no requirement for it; xdata is a constant that could also have been passed some other way. It is there just for convenience in standard regression tasks.
Thus it is also no problem to have ydata twice as long as xdata: just apply .flatten() to the 2-dimensional array.
Next, the parameter list has to be a list of scalars, so add Y0 as a parameter and pass the initial vector [X0, Y0] to the solver.
Together these corrections lead to a result, which is, however, not very convincing.
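For reference, a minimal sketch of the direct fit with these corrections applied (the initial guess p0 here is my own assumption, not a value from the question):
def integration(t, A, B, C, D, X0, Y0):
    # fold the initial state into the parameter list and flatten the
    # 2-D solver output so it matches the flattened data
    res = odeint(derivative, [X0, Y0], t, args=(A, B, C, D))
    return res.flatten()

XData = t
YData = X.flatten()
params, cov = curve_fit(integration, XData, YData, p0=[1, 0.1, 0.1, 1, 30, 4])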
I got a better result, though still not a perfect one, using a multiple-shooting approach: take the points in X[:-1], integrate each over a time step of 1, and compare the collected list of end points to X[1:]. This works better at finding parameters that match amplitude and frequency, but produces a slight speed difference; the result looks better with a 3% correction of the coefficients.
One would probably need a mix of both approaches to get the local as well as the global characteristics respected.
And indeed it works, giving the parameters
A,B = 0.5215206964006734, 0.02567364947581818
C,D = 0.02493663631623848, 0.8476224408838039
X0,Y0 = 34.53872014350661, 4.653177640949391
Code for that more complex fitting program: for the residual computation, first encapsulate the solver to avoid repeating the solver parameters. Then use it to integrate first over the full interval with the variable initial point, and then over the time-step-1 segments.
def solver(XY, t, para):
    return odeint(derivative, XY, t, args=para, atol=1e-8, rtol=1e-11)

def integration(XY_arr, *para):
    XY0 = para[4:]
    para = para[:4]
    T = np.arange(len(XY_arr))
    res0 = solver(XY0, T, para)
    res1 = [solver(XY, [t, t+1], para)[-1]
            for t, XY in enumerate(XY_arr[:-1])]
    return np.concatenate([res0, res1]).flatten()
This obviously needs the reference array prepared in a similar fashion:
XData = X
YData = np.concatenate([X, X[1:]]).flatten()
p0 = [0.5215, 0.02567,
      0.02493, 0.8476,
      34.53, 4.653]
After that the curve-fitting call remains essentially the same; all changes happened before it:
params, info = curve_fit(integration, XData, YData, p0=p0)
XY0, para = params[4:], params[:4]
print(XY0, tuple(para))
t_plot = np.linspace(0, len(X), 500)
x_plot = solver(XY0, t_plot, tuple(para))
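To compare the fitted trajectories against the data, a short plotting snippet (matplotlib is my addition here; it is not imported anywhere in the original code):
import matplotlib.pyplot as plt

# model trajectories on the fine grid, data points on top
plt.plot(t_plot, x_plot[:, 0], label="species x (model)")
plt.plot(t_plot, x_plot[:, 1], label="species y (model)")
plt.plot(t, X[:, 0], "o", label="species x (data)")
plt.plot(t, X[:, 1], "o", label="species y (data)")
plt.legend()
plt.show()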

Related

Interpolate image at specific coordinates

Given an image (array) in rectangular form, how do I interpolate specific pixel positions? The following code produces a 20x30 grid, with each pixel filled with a value (zg). The code then constructs an interpolator with scipy's interp2d method. What I want is to obtain interpolated values at specific coordinates. In the given example, at x = [1.5, 2.4, 5.8], y = [0.5, 7.2, 2.2], so for a total of 3 positions. However, the function returns a 3x3 array for some reason. Why? And how would I change the code so that only these three coordinates are evaluated?
import numpy as np
from scipy.interpolate import interp2d
# Rectangular grid
x = np.arange(20)
y = np.arange(30)
xg, yg = np.meshgrid(x, y)
zg = np.exp(-(2*xg)**2 - (yg/2)**2)
# Define interpolator
interp = interp2d(yg, xg, zg)
# Interpolate pixel value
zi = interp([1.5, 2.4, 5.8], [0.5, 7.2, 2.2])
print(zi.shape) # = (3, 3)
Your code is fine. The interp interpolation function is computing all the possible combinations of coordinates, i.e. 3 × 3 = 9. For instance:
>>> interp(1.5, 0.5)
array([0.04635516])
>>> interp(1.5, 7.2)
array([0.02152198])
>>> interp(5.8, 2.2)
array([0.03073694])
>>> interp(2.4, 2.2)
array([0.03810408])
Indeed you can find these values in the returned matrix:
>>> interp([1.5, 2.4, 5.8], [0.5, 7.2, 2.2])
array([[0.04635516, 0.04409826, 0.03557219],
       [0.0400542 , 0.03810408, 0.03073694],
       [0.02152198, 0.02047414, 0.01651562]])
The documentation states that the return value is a
2-D array with shape (len(y), len(x))
If you just want the coordinates you need, you can do the following:
xe = [1.5, 2.4, 5.8]
ye = [0.5, 7.2, 2.2]
>>> [interp(x, y)[0] for x, y in zip(xe, ye)]
[0.04635515780224686, 0.020474138863349815, 0.030736938802464715]
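As a side note, newer SciPy versions deprecate interp2d. Here is a sketch of the same per-point evaluation with RegularGridInterpolator, which returns one value per input point (the axis order is my assumption, documented in the comments):
from scipy.interpolate import RegularGridInterpolator

# axes in (row, column) order: rows of zg run along y, columns along x
rgi = RegularGridInterpolator((y, x), zg)
# one (y, x) pair per requested position -> one value per position
pts = np.column_stack([[0.5, 7.2, 2.2], [1.5, 2.4, 5.8]])
zi = rgi(pts)
print(zi.shape)  # (3,)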

Dividing circumference into equal parts and returning coordinates

I have created several circles with different origins using Python and I am trying to implement a function that will divide each circle into n number of equal parts along the circumference. I am trying to populate an array that contains the starting [x,y] coordinate for each part on the circumference.
My code is as follows:
def fnCalculateArcCoordinates(self, intButtonCount, radius, center):
    lstButtonCoord = []
    #for degrees in range(0,360,intAngle):
    for arc in range(1, intButtonCount + 1):
        degrees = arc * 360 / intButtonCount
        xDegreesCoord = int(center[0] + radius * math.cos(math.radians(degrees)))
        yDegreesCoord = int(center[1] + radius * math.sin(math.radians(degrees)))
        lstButtonCoord.append([xDegreesCoord, yDegreesCoord])
    return lstButtonCoord
When I run the code for 3 parts, an example of the set of coordinates that are returned are:
[[157, 214], [157, 85], [270, 149]]
This means the segments are of different sizes. Could someone please help me identify where my error is?
The results of such trigonometric calculations are rarely exact integers; by truncating them to int you lose some precision, of course. The approximate (squared Pythagorean) distance checks suggest that your math is correct:
(270-157)**2 + (149-85)**2
# 16865
(270-157)**2 + (214-149)**2
# 16994
(157-157)**2 + (214-85)**2
# 16641
Furthermore, you can use the built-in complex number type and the cmath module. In particular cmath.rect converts polar coordinates (a radius and an angle) into rectangular coordinates:
import cmath

def calc(count, radius, center):
    x, y = center
    for i in range(count):
        r = cmath.rect(radius, (2*cmath.pi)*(i/count))
        yield [round(x + r.real, 2), round(y + r.imag, 2)]
list(calc(4, 2, [0, 0]))
# [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [-0.0, -2.0]]
list(calc(6, 1, [0, 0]))
# [[1.0, 0.0], [0.5, 0.87], [-0.5, 0.87], [-1.0, 0.0], [-0.5, -0.87], [0.5, -0.87]]
Adjust the rounding as you see fit.
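As a quick sanity check (my addition, not part of the original answer), consecutive points returned by calc should be equidistant:
import math

pts = list(calc(3, 100, [157, 150]))
# chord lengths between consecutive points, including the wrap-around pair
dists = [math.dist(pts[i], pts[i - 1]) for i in range(len(pts))]
print(dists)  # three (nearly) identical values, up to the rounding in calc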

Optimization of numpy mesh creation for efficient interpolation

I am reading magnetic field data from a text file. My goal is to correctly and efficiently load the mesh points (in 3 dimensions) and the associated fields (for simplicity I will assume below that I have a scalar field).
I managed to make it work, but I feel that some steps might not be necessary. In particular, from reading the numpy docs it seems that "broadcasting" might be able to work its magic to my advantage.
import numpy as np
from scipy import interpolate
# Loaded from a text file, here the sampling over each dimension is identical but it is not required
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
z = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
# Create a mesh explicitly
mx, my, mz = np.meshgrid(x, y, z, indexing='ij') # I have to switch from 'xy' to 'ij'
# These 3 lines seem odd
mx = mx.reshape(np.prod(mx.shape))
my = my.reshape(np.prod(my.shape))
mz = mz.reshape(np.prod(mz.shape))
# Loaded from a text file
field = np.random.rand(len(mx))
# Put it all together
data = np.array([mx, my, mz, field]).T
# Interpolate
interpolation_points = np.array([[0, 0, 0]])
interpolate.griddata(data[:, 0:3], data[:, 3], interpolation_points, method='linear')
Is it really necessary to construct the mesh like this? Is it possible to make it more efficient?
Here's one approach with broadcasted assignment to generate data directly from x, y, z, which avoids the memory overhead of creating all the mesh-grids and hopefully leads to better performance:
m, n, r = len(x), len(y), len(z)
out = np.empty((m, n, r, 4))
out[..., 0] = x[:, None, None]
out[..., 1] = y[:, None]
out[..., 2] = z
out[..., 3] = np.random.rand(m, n, r)
data_out = out.reshape(-1, out.shape[-1])
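A quick sanity check (my addition, assuming the 'ij'-indexed meshgrid from the question is the reference): the broadcasted version reproduces the flattened meshgrid coordinates exactly:
# the first three columns should match the flattened 'ij'-indexed meshgrid
mx, my, mz = np.meshgrid(x, y, z, indexing='ij')
ref = np.column_stack([mx.ravel(), my.ravel(), mz.ravel()])
assert np.allclose(data_out[:, :3], ref)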

Least squares function and 4 parameter logistics function not working

Relatively new to Python, mainly using it for plotting things. I am currently attempting to determine a best-fit line using the 4 parameter logistic (4PL) equation and curve_fit from scipy. There are one or two sites showing how 4PL works, but I could not get them to work for my data. Example (similar 4PL data) below:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import scipy.optimize as optimization

xdata = [2.3, 2.3, 2, 2, 1.7, 1.7, 1, 1, 0.000001, 0.000001, -1, -1]
ydata = [0.32, 0.3, 0.55, 0.60, 0.88, 0.92, 1.27, 1.21, 1.15, 1.12, 1.1, 1.1]

def fourPL(x, A, B, C, D):
    return ((A-D)/(1.0+((x/C)**(B))) + D)

guess = [0, -0.5, 0.5, 1]
params, params_covariance = optimization.curve_fit(fourPL, xdata, ydata, guess)
params
This gives a warning (with the test data there is also an exponent warning, which does not occur with my real data):
OptimizeWarning: Covariance of the parameters could not be estimated
And params returns my initial guess. I have tried various initial guesses.
The best-fit line is drawn when plotting, but it is not a curve and does not go below x = 0 (I cannot find a reason why negatives would mess with the 4PL model).
4PL fit plotted
I'm not sure whether I am doing something incorrect with the equation, or with how the curve_fit function works, or both. I have a similar issue using least squares instead of curve_fit. I've tried a bunch of variations based on similar fitting equations, but have been stuck for a while; any help in pointing me in the right direction would be much appreciated.
I'm surprised you did not get any warnings, or did not share them with us. I can't analyze this task for you by scientific means, so here are just some remarks about the technical side:
Observation
When running your code, you should see some warnings like:
RuntimeWarning: invalid value encountered in power
return ((A-D)/(1.0+((x/C)**(B))) + D)
Don't ignore this!
Debugging
Add some prints to your function fourPL, probably for all the different components of the expression, and look at what's happening.
Example:
def fourPL(x, A, B, C, D):
    print('1: ', (A-D))
    print('2: ', (x/C))
    print('3: ', (1.0+((x/C)**(B))))
    return ((A-D)/(1.0+((x/C)**(B))) + D)
...
params, params_covariance = optimization.curve_fit(fourPL, xdata, ydata, guess, maxfev=1)
# maxfev=1 -> just check one or a few iterations
Output:
1: -1.0
2: [ 4.60000000e+00  4.60000000e+00  4.00000000e+00  4.00000000e+00
     3.40000000e+00  3.40000000e+00  2.00000000e+00  2.00000000e+00
     2.00000000e-06  2.00000000e-06 -2.00000000e+00 -2.00000000e+00]
RuntimeWarning: invalid value encountered in power
print('3: ', (1.0+((x/C)**(B))))
3: [ 1.4662524    1.4662524    1.5          1.5          1.54232614
     1.54232614   1.70710678   1.70710678 708.10678119 708.10678119
     nan          nan]
That's enough to stop. nans and infs are bad!
Theory
Now it's time for theory, which I won't go into here. But usually you should now think about the underlying theory and why these problems occur.
Is there something you missed in regard to the model's assumptions?
Repair (without checking theory)
Without checking the theory, and just looking over some example found within 30 seconds: hmm, are negative x-values a problem? (A negative base raised to a non-integer power is undefined over the reals, which is exactly what produces the nan above.)
Let's shift x (by its minimum; hardcoded as 1 here):
xdata = np.array([2.3, 2.3, 2, 2, 1.7, 1.7, 1, 1, 0.000001, 0.000001, -1, -1]) + 1
Complete code:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import scipy.optimize as optimization

xdata = np.array([2.3, 2.3, 2, 2, 1.7, 1.7, 1, 1, 0.000001, 0.000001, -1, -1]) + 1
ydata = np.array([0.32, 0.3, 0.55, 0.60, 0.88, 0.92, 1.27, 1.21, 1.15, 1.12, 1.1, 1.1])

def fourPL(x, A, B, C, D):
    return ((A-D)/(1.0+((x/C)**(B))) + D)

guess = [0, -0.5, 0.5, 1]
params, params_covariance = optimization.curve_fit(fourPL, xdata, ydata, guess)  #, maxfev=1

x_min, x_max = np.amin(xdata), np.amax(xdata)
xs = np.linspace(x_min, x_max, 1000)
plt.scatter(xdata, ydata)
plt.plot(xs, fourPL(xs, *params))
plt.show()
Output:
RuntimeWarning: divide by zero encountered in power
return ((A-D)/(1.0+((x/C)**(B))) + D)
It looks good, but it's time for another theory session: what did our linear shift do to our results? I'm ignoring this again.
So, just one warning and a nice-looking output.
If you want to remove that last warning, add some small epsilon to not have 0's in xdata:
xdata = np.array([2.3, 2.3, 2, 2, 1.7, 1.7, 1, 1, 0.000001, 0.000001, -1, -1]) + 1 + 1e-10
which will achieve the same, without any warning.

numpy linalg.lstsq with big values

I'm using linalg.lstsq to build a regression line inside a function like this:
def lsreg(x, y):
    if not isinstance(x, np.ndarray):
        x = np.array(x)
    if not isinstance(y, np.ndarray):
        y = np.array(y)
    A = np.array([x, np.ones(len(x))])
    ret = np.linalg.lstsq(A.T, y)
    return ret[0]
and calling it like this:
x = np.array([10000001, 10000002, 10000003])
y = np.array([3.0, 4.0, 5.0])
regress = lsreg(x, y)
fit = regress[0]*x + regress[1]
print(fit)
and the output I get is:
[ 3. 4. 5.]
So far, so good. Now, if I change x like this:
x = np.array([100000001, 100000002, 100000003])
y = np.array([3.0, 4.0, 5.0])
regress = lsreg(x, y)
fit = regress[0]*x + regress[1]
print(fit)
I get
[ 3.99999997 4.00000001 4.00000005]
instead of something close to 3, 4 and 5.
Any clue on what is going on ?
Your problem is due to numerical errors that occur when solving an ill-conditioned system of equations.
In [115]: np.linalg.lstsq(A.T, y)
Out[115]:
(array([ 3.99999993e-08, 3.99999985e-16]),
 array([], dtype=float64),
 1,
 array([ 1.73205084e+08, 1.41421352e-08]))
Notice that np.linalg.lstsq returned "1" for the rank of the matrix A.T formed from your input. This means it thinks your matrix is rank 1 and hence ill-conditioned (as your least-squares system is a 2 x 2 system of equations, it should be rank 2). The second singular value, which is close to 0, confirms this. This is the reason for the "wrong" result. You should google along the lines of "numerical linear algebra numerical errors" to learn more about this problem.
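A standard remedy (my suggestion, not part of the original answer) is to center x before fitting, which makes the two columns of the design matrix far less collinear:
# centering x removes the huge common offset that makes the columns of
# [x, 1] nearly parallel; the slope is unchanged and only the intercept
# is expressed relative to the centered data
xm = x.mean()
slope, intercept = lsreg(x - xm, y)
fit = slope * (x - xm) + intercept
print(fit)  # close to [3. 4. 5.]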
I tried with scipy:
from scipy import stats
x = np.array([100000001, 100000002, 100000003])
y = np.array([3.0, 4.0, 5.0])
res = stats.linregress(x, y)
print(x*res[0] + res[1])
and I get:
[ 3. 4. 5.]
