While trying to create an example with scipy.optimize curve_fit I found that scipy seems to be incompatible with Python's math module. While function f1 works fine, f2 throws an error message.
import numpy as np
from scipy.optimize import curve_fit
from math import sin, pi, log, exp, floor, fabs, pow
x_axis = np.asarray([pi * i / 6 for i in range(-6, 7)])
y_axis = np.asarray([sin(i) for i in x_axis])
def f1(x, m, n):
    return m * x + n
coeff1, mat = curve_fit(f1, x_axis, y_axis)
print(coeff1)
def f2(x, m, n):
    return m * sin(x) + n  # math.sin here: this is the line that raises the error
coeff2, mat = curve_fit(f2, x_axis, y_axis)
print(coeff2)
The full traceback is
Traceback (most recent call last):
File "/Documents/Programming/Eclipse/PythonDevFiles/so_test.py", line 49, in <module>
coeff2, mat = curve_fit(f2, x_axis, y_axis)
File "/usr/local/lib/python3.5/dist-packages/scipy/optimize/minpack.py", line 742, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/scipy/optimize/minpack.py", line 377, in leastsq
shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
File "/usr/local/lib/python3.5/dist-packages/scipy/optimize/minpack.py", line 26, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
File "/usr/local/lib/python3.5/dist-packages/scipy/optimize/minpack.py", line 454, in func_wrapped
return func(xdata, *params) - ydata
File "/Documents/Programming/Eclipse/PythonDevFiles/so_test.py", line 47, in f2
return m * sin(x) + n
TypeError: only length-1 arrays can be converted to Python scalars
The error appears with lists and NumPy arrays alike. It affects every math function I tested (see the imports above), so it must have something to do with how the math module handles its input. This is most obvious with pow(): if I don't import it from math, curve_fit works properly with the built-in pow().
The obvious question: why does this happen, and how can math functions be used with curve_fit?
P.S.: Please don't point out that this sample data shouldn't be fitted with a linear function. It was chosen only to illustrate the problem.
Be careful to distinguish operations that work on NumPy arrays from operations that work on scalars!
scipy.optimize assumes the input (the initial point) is a 1-D array, and things often go wrong in other cases. A list, for example, is converted to an array, and code that assumes it is still working on a list then breaks. Problems of this kind are common here on StackOverflow and are hard to debug by eye; running the code interactively helps.
import numpy as np
import math
x = np.ones(1)
np.sin(x)
> array([0.84147098])
math.sin(x)
> 0.8414709848078965 # this only works as numpy has dedicated support
# as indicated by the error-msg below!
x = np.ones(2)
np.sin(x)
> array([0.84147098, 0.84147098])
math.sin(x)
> TypeError: only size-1 arrays can be converted to Python scalars
To be honest: this is part of a very basic understanding of numpy and should be understood when using scipy's somewhat sensitive functions.
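Applied to the question, a minimal sketch of the fix is to swap math.sin for np.sin inside the model function, since curve_fit hands the whole x array to the model in one call. The data below mirrors the question's sample points; nothing else is assumed.

```python
import numpy as np
from scipy.optimize import curve_fit

# Same sample points as in the question: 13 values from -pi to pi.
x_axis = np.linspace(-np.pi, np.pi, 13)
y_axis = np.sin(x_axis)

def f2(x, m, n):
    # np.sin operates elementwise on arrays, unlike math.sin.
    return m * np.sin(x) + n

coeff2, cov = curve_fit(f2, x_axis, y_axis)
print(coeff2)  # m should come out close to 1 and n close to 0
```

If a scalar-only function has to be used, wrapping it with np.vectorize is another option, at the cost of a Python-level loop.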
I want to create a linear operator in Python to solve Ax = b, where A is a large, dense matrix of float64. Since the matrix A causes both performance and memory problems, I thought about creating a customized operator as follows:
from numpy import ones
from numpy.linalg import inv
import scipy.sparse.linalg
from sklearn.datasets import make_spd_matrix
n = 100
def solver(A, b):
    return inv(A).dot(b)
M = make_spd_matrix(n, random_state=11)
print(M.shape)
solverFunc = scipy.sparse.linalg.LinearOperator((n, n), matvec=solver)
solverFunc.matvec(M, ones((n, 1)))
However, I get the following error:
Traceback (most recent call last):
File "C:\Users\anoir\Desktop\CG_accelerator\inversion\main.py", line 15, in <module>
solverFunc = LinearOperator((n, n), matvec=solver)
File "C:\ProgramData\Anaconda3\envs\inversion\lib\site-packages\scipy\sparse\linalg\interface.py", line 521, in __init__
self._init_dtype()
File "C:\ProgramData\Anaconda3\envs\inversion\lib\site-packages\scipy\sparse\linalg\interface.py", line 178, in _init_dtype
self.dtype = np.asarray(self.matvec(v)).dtype
File "C:\ProgramData\Anaconda3\envs\inversion\lib\site-packages\scipy\sparse\linalg\interface.py", line 232, in matvec
y = self._matvec(x)
File "C:\ProgramData\Anaconda3\envs\inversion\lib\site-packages\scipy\sparse\linalg\interface.py", line 530, in _matvec
return self.__matvec_impl(x)
TypeError: solver() missing 1 required positional argument: 'b'
What seems to be the problem here? I followed the documentation but there is nothing about custom LinearOperator.
The matvec callable of a LinearOperator takes only one argument, the vector. You can get around this by capturing A in a closure, as shown below:
from numpy.linalg import inv
import numpy as np
import scipy.sparse.linalg
from scipy.sparse import random
import timeit
n = 100
def solver_closure(A):
    # This is the outer enclosing function
    def solver(b):
        return inv(A).dot(b)
    return solver  # returns the nested function
M = np.random.rand(n, n)
b = range(n)
print(M.shape)
solverFunc = scipy.sparse.linalg.LinearOperator((n, n), matvec=solver_closure(M))
def test100():
    x = solverFunc.matvec(b)
    print(np.matmul(M,x))
print(timeit.timeit("test100()", setup="from __main__ import test100",number=10))
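The closure above recomputes inv(A) on every call. As a hedged variant, and not part of the original answer, the sketch below factors A once with scipy.linalg.lu_factor inside the closure and reuses the factorization in each matvec. It assumes A is square and non-singular; make_solver and the diagonally dominant test matrix are illustrative names and data.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve
from scipy.sparse.linalg import LinearOperator

def make_solver(A):
    # Factor A once; the nested function reuses the factorization.
    lu_piv = lu_factor(A)
    def solver(b):
        return lu_solve(lu_piv, b)
    return solver

n = 100
rng = np.random.default_rng(0)
# Adding n on the diagonal keeps the illustrative matrix well conditioned.
M = rng.random((n, n)) + n * np.eye(n)
b = np.arange(n, dtype=float)

# Passing dtype explicitly means the operator need not probe matvec to infer it.
op = LinearOperator((n, n), matvec=make_solver(M), dtype=M.dtype)
x = op.matvec(b)
residual = np.linalg.norm(M @ x - b)
print(residual)
```

Whether this helps with the memory problem from the question is a separate matter, since the dense factorization still has to be stored.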
Is it possible to concatenate scipy.optimize.curve_fit with scipy.optimize.bisect (or fsolve, or whatever) for implicit scalar functions?
In practice, have a look at this Python code where I try to define an implicit function and pass it to curve_fit to obtain the best fit for a parameter:
import numpy as np
import scipy.optimize as opt
import scipy.special as spc
# Estimate of initial parameter (not really important for this example)
fact, _, _, _ = spc.airy(-1.0188)
par0 = -np.log(2.0*fact*(18**(1.0/3.0))*np.pi*1e-6)
# Definition of an implicit parametric function f(c,t;b)=0
def func_impl(c, t, p) :
    return ( c - ((t**3)/9.0) / ( np.log(t*(c**(1.0/3.0))) + p ) )
# definition of the function I believe should be passed to curve_fit
def func_egg(t, p) :
    x_st, _ = opt.bisect( lambda x : func_impl(x, t, p), a=0.01, b=0.3 )
    return x_st
# Some data points
t_data = np.deg2rad(np.array([95.0, 69.1, 38.8, 14.7]))
c_data = np.array([0.25, 0.10, 0.05, 0.01])
# Call to curve_fit
popt, pcov = opt.curve_fit(func_egg, t_data, c_data, p0=par0)
b = popt[0]
Now, I am aware of all the things that may go wrong when trying to automatically find roots (although bisection should be stable, provided there's a root between a and b); however, the error I get seems to concern the dimensionality of the output of func_impl:
Traceback (most recent call last):
File "example_fit.py", line 23, in <module>
popt, pcov = opt.curve_fit(func_egg, t_data, c_data, p0=par0)
File "/usr/local/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 752, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "/usr/local/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 383, in leastsq
shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
File "/usr/local/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 26, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
File "/usr/local/lib/python3.7/site-packages/scipy/optimize/minpack.py", line 458, in func_wrapped
return func(xdata, *params) - ydata
File "example_fit.py", line 15, in func_egg
x_st, _ = opt.bisect( lambda x : func_impl(x, t, p), a=0.01, b=0.3 )
File "/usr/local/lib/python3.7/site-packages/scipy/optimize/zeros.py", line 550, in bisect
r = _zeros._bisect(f, a, b, xtol, rtol, maxiter, args, full_output, disp)
File "example_fit.py", line 15, in <lambda>
x_st, _ = opt.bisect( lambda x : func_impl(x, t, p), a=0.01, b=0.3 )
File "example_fit.py", line 11, in func_impl
return ( c - ((t**3)/9.0) / ( np.log(t*(c**(1.0/3.0))) + p ) )
TypeError: only size-1 arrays can be converted to Python scalars
My guess is that curve_fit expects the model function to return a vector with the same dimensionality as the input data. I thought I could easily work around this by 'vectorizing' the implicit function, or func_egg, but it does not seem as trivial as I thought.
Am I missing something?
Is there a simple workaround?
I guess I end up answering my own question. I hope this could be useful to others.
Let's first choose a simpler implicit function, in this case, f(c,t;b)=c-b*t^3 (the reason will be clarified later):
import numpy as np
import scipy.optimize as opt
import scipy.special as spc
import matplotlib.pyplot as plt
# Definition of an implicit parametric function f(c,t;b)=0
def func_impl(c, t, p) :
    return (c-p*t**3)
Let's vectorize it:
v_func_impl = np.vectorize(func_impl)
The same script as the one in the question, but now (1) func_egg is vectorized, and (2) I use newton instead of bisect (I found it easier to provide x0 instead of [a,b]):
# Definition of the function I believe should be passed to curve_fit
def func_egg(t, p) :
    x_st = opt.newton( lambda x : func_impl(x, t, p), x0=0.05 )
    return x_st
v_func_egg = np.vectorize(func_egg)
# Some data points
t_data = np.deg2rad(np.array([127.0, 95.0, 69.1, 38.8]))
c_data = np.array([0.6, 0.25, 0.10, 0.05])
# Call to curve_fit
par0 = 0.05
popt, pcov = opt.curve_fit(v_func_egg, t_data, c_data, p0=par0)
b = popt[0]
Now it works!
plt.plot(t_data, c_data)
plt.plot(np.linspace(0.5, 2.5), b*np.linspace(0.5, 2.5)**3)
plt.show()
So, in essence:
1. In order to concatenate scipy curve-fitting and root-finding, one needs to ensure that each function is vectorized (or can deal with numpy arrays as input and output).
2. Make sure that your function is not 'too ugly'; otherwise, even if the concatenation works, the root-finding procedure itself may not be able to find a result (this goes into numerical mathematics; I should have checked the regularity of my original function).
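The two points above can be condensed into a short self-contained sketch. It keeps the simple implicit function f(c,t;p) = c - p*t^3 from this answer but swaps in brentq as the root finder; the bracket [-100, 100], the synthetic data and the true value p = 0.3 are assumptions made purely for illustration.

```python
import numpy as np
from scipy.optimize import brentq, curve_fit

def implicit(c, t, p):
    # Implicit relation f(c, t; p) = 0.
    return c - p * t**3

def solve_c(t, p):
    # Scalar root solve for one t; the bracket is assumed to contain the root.
    return brentq(implicit, -100.0, 100.0, args=(t, p))

# np.vectorize loops over t so the model accepts and returns arrays.
solve_c_vec = np.vectorize(solve_c, otypes=[float])

t_data = np.linspace(0.5, 2.0, 8)
c_data = 0.3 * t_data**3  # synthetic, noise-free data

popt, pcov = curve_fit(solve_c_vec, t_data, c_data, p0=[0.1])
print(popt)
```

np.vectorize is a convenience loop rather than true vectorization, so every model evaluation costs one root solve per data point.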
I recently ran into a question about integration and encountered a strange bug. I attempt a very simple problem using solve_ivp:
from scipy.integrate import solve_ivp
import numpy as np
def f(y, t):
    return y
y0 = [1,1,1,1]
method = 'RK23'
s = solve_ivp(f, (0,1), y0, method=method, t_eval=np.linspace(0,1))
And it works fine. When I change to method='BDF' or method='Radau' I get an error:
Traceback (most recent call last):
File "<ipython-input-222-f11c4406e92c>", line 10, in <module>
s = solve_ivp(f, (0,1), y0, method=method, t_eval=np.linspace(0,1))
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\ivp.py", line 455, in solve_ivp
solver = method(fun, t0, y0, tf, vectorized=vectorized, **options)
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\radau.py", line 299, in __init__
self.jac, self.J = self._validate_jac(jac, jac_sparsity)
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\radau.py", line 345, in _validate_jac
J = jac_wrapped(t0, y0, self.f)
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\radau.py", line 343, in jac_wrapped
sparsity)
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\common.py", line 307, in num_jac
return _dense_num_jac(fun, t, y, f, h, factor, y_scale)
File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\_ivp\common.py", line 318, in _dense_num_jac
diff = f_new - f[:, None]
IndexError: too many indices for array
I also get an error with method = 'LSODA', although a different one, so all of the implicit integrators fail. I do not get an error with any of the explicit integrators.
I tried this in spyder with scipy version 1.0.0 and in google colab (scipy version 1.1.0), with the same results.
Is this a bug or am I missing some argument I need for implicit integrators??
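It appears that the Radau and BDF methods do not handle a right-hand side that returns a bare scalar. As Weckesser points out in the comments, solve_ivp calls the right-hand side as f(t, y), not f(y, t), so the function in the question actually returns the scalar time t instead of the state. Using the correct argument order and returning a 1-D result solves your issue. When y is already a 1-D array, as with the y0 above, it can be returned unchanged; a scalar result should be wrapped in a list.
Like this
def f(t, y):
    return y  # y is already 1-D here; use [value] for a scalar result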
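A minimal sketch of the corrected call, assuming the state y reaches f as a 1-D array (as it does for the question's four-element y0), so it can be returned unchanged. The loop over methods and the results dictionary are only there to compare the solvers.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, y):
    # solve_ivp passes time first and the state second; dy/dt = y.
    return y

y0 = [1.0, 1.0, 1.0, 1.0]
t_eval = np.linspace(0, 1)

results = {}
for method in ("RK23", "BDF", "Radau", "LSODA"):
    results[method] = solve_ivp(f, (0, 1), y0, method=method, t_eval=t_eval)
    # Each component should end near e, within the default tolerances.
    print(method, results[method].success, results[method].y[:, -1])
```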
I’m having trouble using the bisect optimizer within scipy. Here are the relevant portions of my code:
How I’m importing things
import numpy as np
import scipy.optimize as sp
import matplotlib.pyplot as plt
Break in code, section causing errors below
#All variables are previously defined except for h
def BeamHeight(h):
    x = 1000e3*M[i]*h/(fw*h^3-(fw-wt)(h-2*ft)^3) - Max_stress_steel
    return x
for i in range(0,50):
    h = np.zeros((50))
    h[i] = sp.bisect(BeamHeight, hb, 5,xtol = 0.001)
Causing this error:
Traceback (most recent call last):
File "ShearMoment.py", line 63, in <module>
h[i] = sp.bisect(BeamHeight, hb, 5,xtol = 0.001)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/zeros.py", line 248, in bisect
r = _zeros._bisect(f,a,b,xtol,rtol,maxiter,args,full_output,disp)
File "ShearMoment.py", line 58, in BeamHeight
x = 1000e3*M[i]*h/(fw*h^3-(fw-wt)(h-2*ft)^3) - Max_stress_steel
TypeError: 'float' object is not callable
I understand that scipy.optimize expects a function as one of its arguments. Am I doing this incorrectly?
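In Python, juxtaposition is not implicit multiplication, and ^ is not exponentiation (it is bitwise XOR). Writing (fw-wt)(h-2*ft) therefore tries to call the float fw-wt as a function, which is what produces the error. Multiplication must be made explicit with *, and exponentiation must be written as **. This part of BeamHeight: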
fw*h^3-(fw-wt)(h-2*ft)^3
must be written as
fw*h**3-(fw-wt)*(h-2*ft)**3
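For reference, a small runnable sketch of the corrected function. The section dimensions, the load and the bracket [0.1, 5] are hypothetical values chosen only so that the function changes sign on the interval; they are not taken from the question.

```python
from scipy.optimize import bisect

# Hypothetical section dimensions and load (not from the question).
fw, wt, ft = 0.2, 0.01, 0.02   # flange width, web thickness, flange thickness
moment = 2.0                   # stands in for M[i]
max_stress_steel = 250e6

def beam_height(h):
    # Explicit * for multiplication and ** for exponentiation.
    section = fw * h**3 - (fw - wt) * (h - 2 * ft)**3
    return 1000e3 * moment * h / section - max_stress_steel

xor_result = 5 ^ 3      # bitwise XOR, gives 6
power_result = 5 ** 3   # exponentiation, gives 125

# bisect needs beam_height to have opposite signs at the two bracket ends.
root = bisect(beam_height, 0.1, 5)
print(xor_result, power_result, root)
```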
I was trying to fit a specific function with scipy and I got weird results. I decided to test something I know the answer to so I created this:
from scipy.optimize import curve_fit as cf
import numpy as np
import random
def func(x,a):
    return a+X
X =[]
for i in range (10):
    V = random.random()
    X.append(i+3 + V/10)
print cf(func, np.array(range(10)),np.array(X))
I expected to get something around 3. Nevertheless, here is the output:
(array([ -2.18158824e-12]), inf)
As a side note, I tried to see what happens when I pass something to func directly, and I got this:
print func(np.array(range(10)),3)
Traceback (most recent call last):
File "/tmp/py1759O-P", line 16, in <module>
print func(np.array(range(10)),3)
File "/tmp/py1759O-P", line 6, in func
return a+X
TypeError: unsupported operand type(s) for +: 'int' and 'list'
What am I doing wrong?
Don't use x and X as variable names when they carry such different meanings (or perhaps you didn't know Python is case sensitive?):
def func(x,a):
    return a+X
X =[]
x is a numpy array, X is a list, and a is a scalar parameter value.
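a+X results in an error when a is a plain Python int, since you cannot add an int to a list. Inside curve_fit, a arrives as a NumPy float, so a+X is silently converted to an array instead of failing. The model then ignores x, the residual a + X - ydata reduces to a, and the fit returns a value near 0.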
In func, the argument is x, but X is used in the body of the function.
Here's a modified version of your code. It uses a few more features of numpy (e.g. np.random.random() instead of random.random()).
from scipy.optimize import curve_fit as cf
import numpy as np
def func(x, a):
    return a + x
n = 10
xdata = np.arange(n)
ydata = func(xdata, 3) + np.random.random(n) / 10
print(cf(func, xdata, ydata))
The output is
(array([ 3.04734293]), array([[ 8.19208558e-05]]))
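To see why the original fit came back near 0 rather than raising an error, here is a small sketch of the type behaviour involved. It assumes, as far as I can tell from how curve_fit unpacks its parameter array, that a reaches func as a NumPy float; the three-element X and the value 1.5 are illustrative.

```python
import numpy as np

X = [3.0, 4.0, 5.0]       # a plain list, like the global X in the question
a_np = np.float64(1.5)    # parameters inside curve_fit are NumPy floats

# NumPy scalar + list: the list is converted to an array and a is added elementwise.
shifted = a_np + X

# Python int + list: this is the TypeError from the question's side note.
try:
    3 + X
    raised = False
except TypeError:
    raised = True

# With ydata equal to X, the residual is just a, so the best fit is a = 0.
residual = shifted - np.array(X)
print(shifted, raised, residual)
```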