I am using scipy.integrate.odeint to solve an initial value problem. Under some conditions, the underlying engine lsoda (a Fortran-implemented ODE solver) produces warning/error outputs. I want to suppress these outputs, but typical strategies for suppressing stdout or stderr don't work.
This issue only occurs if lsoda doesn't like the input. I realize that with different input the problem might not occur. Nevertheless, I want to suppress the output even with bad input!
Minimal example:
import numpy as np
from scipy.integrate import odeint
def model(y, t):
    return 1000 * y
# Pass impossible tolerances to induce an error
y = odeint(model, 5, np.linspace(0, 20), rtol=1e-50, atol=1e-50)
Output:
lsoda-- at start of problem, too much accuracy
requested for precision of machine.. see tolsf (=r1)
in above message, r1 = 0.3700743415417D+37
If I try to filter the output by changing the odeint call to:
import os, contextlib
with open(os.devnull, 'w') as f:
    with contextlib.redirect_stderr(f), contextlib.redirect_stdout(f):
        y = odeint(model, 5, np.linspace(0, 20), rtol=1e-50, atol=1e-50)
Output:
lsoda-- at start of problem, too much accuracy
requested for precision of machine.. see tolsf (=r1)
in above message, r1 = 0.3700743415417D+37
Relevant issue on the Scipy repo: https://github.com/scipy/scipy/issues/15940
I'm looking for any solution that enables the output to be silenced. This is a major pain point when working in an interactive session, especially when the output is another lsoda error:
lsoda-- warning..internal t (=r1) and h (=r2) are
such that in the machine, t + h = t on the next step
(h = step size). solver will continue anyway
This message often repeats many times and clogs up the output.
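One thing that might be worth trying is redirecting at the level of the operating system's file descriptors rather than Python's sys.stdout and sys.stderr objects. contextlib.redirect_stdout only swaps the Python-level object, so anything that compiled code writes straight to descriptor 1 or 2 never passes through it. The following is only a sketch under that assumption: the helper name redirect_fds is made up for illustration, and I have not checked it against every SciPy version or inside notebook front ends, where output is routed differently.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def redirect_fds(target=os.devnull, fds=(1, 2)):
    """Temporarily point OS-level file descriptors at `target`.

    `redirect_fds` is a hypothetical helper, not part of SciPy or the
    standard library. By default it silences descriptor 1 (stdout) and
    descriptor 2 (stderr).
    """
    # Flush Python-level buffers so earlier output is not swallowed.
    sys.stdout.flush()
    sys.stderr.flush()
    saved = [os.dup(fd) for fd in fds]  # remember the original targets
    target_fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        for fd in fds:
            os.dup2(target_fd, fd)  # descriptor now refers to `target`
        yield
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        for fd, saved_fd in zip(fds, saved):
            os.dup2(saved_fd, fd)  # restore the original descriptor
            os.close(saved_fd)
        os.close(target_fd)
```

Usage would then look like `with redirect_fds(): y = odeint(model, 5, np.linspace(0, 20), rtol=1e-50, atol=1e-50)`. Note that this hides everything written to those descriptors inside the block, including genuine error messages.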
Related
I am trying to perform dimensionality reduction on my n_samples x 53 data using scikit-learn's Kernel PCA with a precomputed kernel. The code worked without any issues when I first tried it with 50 samples. However, when I increased the number of samples to 100, I suddenly got the following message.
Process finished with exit code -1073740940 (0xC0000374)
Here's the detail of what I want to do:
I want to obtain the optimum value of kernel function hyperparameter in my Kernel PCA function, defined as the following.
from sklearn.decomposition import KernelPCA as drm
from somewhere import costfunction
from somewhere_else import customkernel
def kpcafun(w, X):
    # X is the sample matrix
    # w is the kernel hyperparameter
    n_princomp = 2
    drmodel = drm(n_components=n_princomp, kernel='precomputed')
    k_matrix = customkernel(X, X, w)
    transformed_x = drmodel.fit_transform(k_matrix)
    cost = costfunction(transformed_x)
    return cost
Therefore, to optimize the hyperparams I used the following code.
from scipy.optimize import minimize
# assume that wstart and optimbound are already defined
res = minimize(kpcafun, wstart, method='L-BFGS-B', bounds=optimbound, args=(X,))
The strange thing is that when I debugged the first 10 iterations of the optimization process, nothing unusual happened and the values of all variables seemed normal. But when I turned off the breakpoints and let the program continue, the message appeared without any error notification.
Does anyone know what might be wrong with my code? Or does anyone have tips for resolving a problem like this?
Thanks
I am trying to solve a system of simultaneous equations as follows:
"145.0x/21025 = -0.334"
"(-48.402x-96.650y+96.650z)/21025 = -0.334"
"(-48.402x+132.070y+35.214z)/21025 = -0.334"
"sqrt(x^2+y^2+z^2) = 145.0"
I am using the following Python script:
from scipy.optimize import root
from numpy import sqrt
from sys import argv, stdout
initGuesses = eval(argv[1])
equations = argv[2:]
def f(variables):
    x, y, z = variables
    results = []
    for eqn in equations:
        results.append(eval(eqn))
    return results
solution = root(f, initGuesses, method="lm")
stdout.write(str(solution["x"][0]) + "," + str(solution["x"][1]) + "," + str(solution["x"][2]))
stdout.flush()
The program is called as follows:
python3 SolvePosition3D.py "(1,1,1)" "(145.0*x+0.0*y+0.0*z)/21025.0+0.334" "(-48.402*x-96.650*y+96.650*z)/21025+0.334" "(-48.402*x+132.070*y+35.214*z)/21025+0.334" "sqrt(x**2+y**2+z**2)-145.0"
And I am receiving the following output:
48.2699997956,35.4758788666,132.042180583
This solution is wrong; the correct solution is roughly
-48,-35,-132
which is the same numbers with the opposite sign.
The answer returned by the program satisfies the final equation, but violates all others.
Does anyone know why this is happening? Getting the correct solutions to these equations (and many others like them) is vitally important to my current project.
I was able to run the code by adding
from numpy import sqrt
from scipy.optimize import root
and switching to regular old prints for the solution.
Your starting guess seems to be wrong. Starting from (1, 1, 1), the root finder converges to 48.2699997956,35.4758788666,132.042180583, as you said. If you plug in (-1,-1,-1), you get -48.2649482763,-35.4698607274,-132.050694891 as expected.
As for why it does that: nonlinear systems are simply hard to solve in general. Most root-finding algorithms are deterministic, so which solution they converge to depends entirely on the starting point. If you need to try multiple starting points, try a grid-based search.
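A grid-based search over starting points could look roughly like the sketch below. It hard-codes the four equations from the question instead of reading them from the command line, and the helper name multistart_root and the choice to rank candidates by the norm of the residual vector are my own assumptions, not anything scipy prescribes.

```python
import itertools

import numpy as np
from scipy.optimize import root


def residuals(v):
    """The four equations from the question, written as f(v) = 0."""
    x, y, z = v
    return [
        145.0 * x / 21025 + 0.334,
        (-48.402 * x - 96.650 * y + 96.650 * z) / 21025 + 0.334,
        (-48.402 * x + 132.070 * y + 35.214 * z) / 21025 + 0.334,
        np.sqrt(x**2 + y**2 + z**2) - 145.0,
    ]


def multistart_root(fun, starts):
    """Run root() from every start; keep the result with the smallest residual.

    Hypothetical helper for illustration only.
    """
    best = None
    for start in starts:
        sol = root(fun, start, method="lm")
        if not sol.success:
            continue
        score = np.linalg.norm(sol.fun)  # how badly the equations are violated
        if best is None or score < best[0]:
            best = (score, sol)
    return best


# Coarse grid: every sign combination of (+-1, +-1, +-1).
starts = itertools.product([-1.0, 1.0], repeat=3)
score, sol = multistart_root(residuals, starts)
print(sol.x, score)
```

Ranking by residual norm is what separates the two candidates here: the point reached from (1, 1, 1) satisfies only the last equation, so its residual is much larger than that of the all-negative solution.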
If I run the following code in python
from scipy.stats import norm, beta
sample = beta.rvs(2,5,size=100)
beta_fit = beta.fit(sample)
I get the following warning
/usr/lib/python3/dist-packages/scipy/stats/_continuous_distns.py:404: RuntimeWarning: invalid value encountered in sqrt
  sk = 2*(b-a)*sqrt(a + b + 1) / (a + b + 2) / sqrt(a*b)
and depending on the size of the sample, I sometimes also get this other warning
/usr/lib/python3/dist-packages/scipy/optimize/minpack.py:161: RuntimeWarning: The iteration is not making good progress, as measured by the improvement from the last ten iterations.
  warnings.warn(msg, RuntimeWarning)
Does anyone know why this is happening and how to fix it?
Thanks!
In a comment you say that you want to keep the support fixed as [0, 1]. To do that with the fit() method, use the arguments floc=0 and fscale=1. Then only the shape parameters will be fit to the data.
from scipy.stats import beta
sample = beta.rvs(2, 5, size=100)
beta_fit = beta.fit(sample, floc=0, fscale=1)
This should also eliminate the warnings that you are seeing. Those warnings occur because when all four parameters are fit, the code uses a generic numerical optimization routine to find the parameters that maximize the likelihood, and something in that code is generating those warnings. (It might be a bug--the shape parameters are supposed to be positive, so neither of the calls to sqrt in the line that generates the warning should get a negative argument.) When you fix the location and scale, the fit() method solves a simpler numerical problem to find the maximum likelihood parameter estimates, so it avoids the code that generates the warnings.
I have obviously read through the documentation, but I have not been able to find a more detailed description of what is happening under the covers. Specifically, there are a few behaviors that I am very confused about:
General setup
import numpy as np
from scipy.integrate import ode
#Constants in ODE
N = 30
K = 0.5
w = np.random.normal(np.pi, 0.1, N)
#Integration parameters
y0 = np.linspace(0, 2*np.pi, N, endpoint=False)
t0 = 0
#Set up the solver
solver = ode(lambda t,y: w + K/N*np.sum( np.sin( y - y.reshape(N,1) ), axis=1))
solver.set_integrator('vode', method='bdf')
solver.set_initial_value(y0, t0)
Problem 1: solver.integrate(t0) fails
Setting up the integrator and asking for the value at t0 the first time reports a successful integration. Repeating the call returns the correct numbers, but solver.successful() then returns False:
solver.integrate(t0)
>>> array([ 0. , 0.20943951, 0.41887902, ..., 5.65486678,
5.86430629, 6.0737458 ])
solver.successful()
>>> True
solver.integrate(t0)
>>> array([ 0. , 0.20943951, 0.41887902, ..., 5.65486678,
5.86430629, 6.0737458 ])
solver.successful()
>>> False
My question is, what is happening in the solver.integrate(t) method that causes it to succeed the first time, and fail subsequently, and what does it mean to have an “unsuccessful” integration? Furthermore, why does the integrator fail silently, and continue to produce useful-looking outputs until I ask it explicitly whether it was successful?
Relatedly, is there a way to reset the failed integration, or do I need to re-instantiate the solver from scratch?
Problem 2: solver.integrate(t) immediately returns an answer for almost any value of t
Even though my initial value of y0 is given at t0=0, I can request the value at t=10000 and get the answer immediately. I would expect that the numerical integration over such a large time span should take at least a few seconds (e.g. in Matlab, asking to integrate over 10000 time steps would take several minutes).
For example, re-run the setup from above and execute:
solver.integrate(10000)
>>> array([ 2153.90803383, 2153.63023706, 2153.60964064, ..., 2160.00982959,
2159.90446056, 2159.82900895])
Is Python really that fast, or is this output total nonsense?
Problem 0
Don’t ignore error messages. Yes, ode’s error messages can be cryptic at times, but you still want to avoid triggering them.
Problem 1
As you already integrated up to t0 with the first call of solver.integrate(t0), you are integrating for a time step of 0 with the second call. This throws the cryptic error:
DVODE-- ISTATE (=I1) .gt. 1 but DVODE not initialized
In above message, I1 = 2
/usr/lib/python3/dist-packages/scipy/integrate/_ode.py:869: UserWarning: vode: Illegal input detected. (See printed message.)
'Unexpected istate=%s' % istate))
Problem 2.1
There is a maximum number of (internal) steps that a solver is going to take in one call without throwing an error. This can be set with the nsteps argument of set_integrator. If you integrate a large time at once, nsteps will be exceeded even if nothing is wrong, and the following error message is thrown:
/usr/lib/python3/dist-packages/scipy/integrate/_ode.py:869: UserWarning: vode: Excess work done on this call. (Perhaps wrong MF.)
'Unexpected istate=%s' % istate))
The integrator then stops at whatever point it has reached when this happens.
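One way to stay within the step budget is to advance in several shorter calls and check solver.successful() after each one. This is only a sketch: the values nsteps=5000, dt=1.0 and t_end=100.0 are arbitrary choices of mine, and the seeded random generator replaces np.random.normal from the question purely to make the run repeatable.

```python
import numpy as np
from scipy.integrate import ode

N = 30
K = 0.5
rng = np.random.default_rng(0)  # seeded for repeatability (my choice)
w = rng.normal(np.pi, 0.1, N)
y0 = np.linspace(0, 2 * np.pi, N, endpoint=False)


def rhs(t, y):
    # Same derivative as the lambda in the question.
    return w + K / N * np.sum(np.sin(y - y.reshape(N, 1)), axis=1)


solver = ode(rhs)
# nsteps caps the internal steps per integrate() call; 5000 is arbitrary.
solver.set_integrator('vode', method='bdf', nsteps=5000)
solver.set_initial_value(y0, 0.0)

t_end, dt = 100.0, 1.0
times, states = [], []
for t in np.arange(dt, t_end + dt, dt):
    y = solver.integrate(t)
    if not solver.successful():
        # Stop instead of silently collecting unreliable values.
        print("integration failed before t =", t)
        break
    times.append(t)
    states.append(y.copy())
```

Each call then only has to cover a span of dt, so the per-call limit is much harder to hit, and a failure is noticed at the step where it occurs.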
Problem 2.2
If you set nsteps=10**10, the integration runs without problems. It still is pretty fast though (roughly 1 s on my machine). The reason for this is as follows:
For a multi-dimensional system such as yours, there are two main runtime sinks when integrating:
Vector and matrix operations within the integrator. In scipy.ode, these are all realised with NumPy operations or ported Fortran or C code. Anyway, they are realised with compiled code without Python overhead and thus very efficient.
Evaluating the derivative (lambda t,y: w + K/N*np.sum( np.sin( y - y.reshape(N,1) ), axis=1) in your case). You realised this with NumPy operations, which again are realised with compiled code and very efficient. You may improve this a little bit with a purely compiled function, but that will grant you at most a small factor. If you used Python lists and loops instead, it would be horribly slow.
Therefore, for your problem, everything relevant is handled with compiled code under the hood and the integration is handled with an efficiency comparable to that of, e.g., a pure C program. I do not know how the two above aspects are handled in Matlab, but if either of the above challenges is handled with interpreted instead of compiled loops, this would explain the runtime discrepancy you observe.
To the second question, yes, the output might be nonsense. Local errors, be they from discretization or floating point operations, accumulate with a compounding factor which is about the Lipschitz constant of the ODE function. In a first estimate, the Lipschitz constant here is K=0.5. The magnification rate of early errors, that is, their coefficient as part of the global error, can thus be as large as exp(0.5*10000), which is a huge number.
On the other hand it is not surprising that the integration is fast. Most of the provided methods use step size adaptation, and with the standard error tolerances this might result in only some tens of internal steps. Reducing the error tolerances will increase the number of internal steps and may change the numerical result drastically.
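To see the effect of the tolerances on the amount of work, one can count how often the derivative function is called. The sketch below does that for two values of rtol; the helper name integrate_counting, the end time of 50, the two tolerances and the seeded random generator are all assumptions of mine for illustration.

```python
import numpy as np
from scipy.integrate import ode

N, K = 30, 0.5
rng = np.random.default_rng(0)  # seeded for repeatability (my choice)
w = rng.normal(np.pi, 0.1, N)
y0 = np.linspace(0, 2 * np.pi, N, endpoint=False)


def integrate_counting(rtol, t_end=50.0):
    """Integrate to t_end in one call and count derivative evaluations."""
    calls = 0

    def rhs(t, y):
        nonlocal calls
        calls += 1
        return w + K / N * np.sum(np.sin(y - y.reshape(N, 1)), axis=1)

    solver = ode(rhs)
    # Large nsteps so the per-call step limit does not interfere.
    solver.set_integrator('vode', method='bdf', rtol=rtol, nsteps=10**6)
    solver.set_initial_value(y0, 0.0)
    y = solver.integrate(t_end)
    return y, calls, solver.successful()


y_loose, calls_loose, ok_loose = integrate_counting(rtol=1e-4)
y_tight, calls_tight, ok_tight = integrate_counting(rtol=1e-10)
print("derivative calls:", calls_loose, "vs", calls_tight)
print("largest difference in y:", np.max(np.abs(y_loose - y_tight)))
```

The printed difference between the two results gives a rough feeling for how much of the loose-tolerance answer can be trusted at that end time.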
I am trying to start using the AR models in statsmodels. However, I seem to be doing something wrong. Consider the following example, which fails:
from statsmodels.tsa.ar_model import AR
import numpy as np
signal = np.ones(20)
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I think this should just continue the (trivial) time series consisting of ones. However, in this case the fit seems to return too few parameters: len(ar_res.params) equals 4, while it should be 5 (four lag coefficients plus the constant). In the following example it works:
signal = np.ones(20)
signal[range(0, 20, 2)] = -1
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I have the feeling that this could be a bug but I am not sure as I have no experience using the package. Maybe someone with more experience can help me...
EDIT: I have reported the issue here.
It works after adding a bit of noise, for example
signal = np.ones(20) + 1e-6 * np.random.randn(20)
My guess is that the constant is not added properly because of perfect collinearity with the signal.
You should open an issue to handle this corner case better. https://github.com/statsmodels/statsmodels/issues
My guess is also that the parameters are not identified in this case, so there might not be any good solution.
(Parameters not identified means that several parameter combinations can produce exactly the same fit, but I think they should all produce the same predictions in this case.)
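The collinearity can be illustrated with plain NumPy, without touching statsmodels at all. The sketch below builds a regression matrix with a constant column and four lagged copies of the signal and compares its rank for the constant and the slightly noisy series. The helper lag_design is hypothetical and is not meant to reproduce how statsmodels builds its design matrix internally.

```python
import numpy as np


def lag_design(signal, n_lags):
    """Columns: a constant, then lag-1 ... lag-n_lags of the signal.

    Hypothetical helper; not the statsmodels implementation.
    """
    n = len(signal)
    lags = [signal[n_lags - k: n - k] for k in range(1, n_lags + 1)]
    return np.column_stack([np.ones(n - n_lags)] + lags)


constant = np.ones(20)
rng = np.random.default_rng(0)
noisy = constant + 1e-6 * rng.standard_normal(20)

# With a constant signal every column is identical, so the five
# parameters cannot be told apart; a little noise separates them.
rank_constant = np.linalg.matrix_rank(lag_design(constant, 4))
rank_noisy = np.linalg.matrix_rank(lag_design(noisy, 4))
print(rank_constant, rank_noisy)
```

A rank below the number of columns is exactly the "not identified" situation: many coefficient vectors give the same fitted values.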