I'm writing a program for a friend of mine who is currently studying Aeronautical Engineering, and I'm trying to test whether the math I've implemented works. For those who know, I'm trying to calculate the divergence (I think; I'm not an engineer and I'm not going to pretend that I am).
He sent me a Stack Overflow link showing how he thinks this should be done (the thread can be found here). His version doesn't work for me, as it gives me a NumPy error, as seen below:
numpy.core._internal.AxisError: axis 1 is out of bounds for array of dimension 1
Now I've tried a different method that gives me a different error as seen below:
ValueError: operands could not be broadcast together with shapes (60,58) (60,59)
This method gives me the error above, and I'm not entirely sure how to fix it. The code that produces it is below:
velocity = np.diff(c_flow)/np.diff(zex)  # np.diff shortens the last axis by one
ucom = velocity.real
vcom = -(velocity.imag)
deltau = np.divide(np.diff(ucom), np.diff(x))  # shapes (60,58) vs (60,59): the broadcast error
deltav = np.divide(np.diff(vcom), np.diff(y))
print(deltau + deltav)
Note: c_flow is defined earlier in the program and is the complex potential. zex is also defined earlier as an early form of the complex variable. x and y are two coordinate matrices built from coordinate vectors.
The expected result of the print statement should be zero or a value very close to zero. (I'm not entirely sure what the value should be, but as I've said, I'm not an engineer.)
Thank you in advance
EDIT:
After following BenT's advice I used np.gradient and np.sum, but this was operating along the wrong axis, so to counteract this I separated the two steps, as seen below:
velocity = np.diff(c_flow)/np.diff(z)
grad = np.gradient(velocity)       # list of derivative arrays, one per axis
divergence = np.sum(grad, axis=0)  # sum the component derivatives
print(np.average(divergence))
print(np.average(velocity))
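For reference, here is a minimal, self-contained sketch of the same divergence computation with np.gradient on a known test field; the grid, the spacings dx and dy, and the field u = x, v = -y (whose divergence is exactly zero) are illustrative, not from the original program:
import numpy as np

# Illustrative uniform grid; replace dx, dy with the real spacings.
dx = dy = 0.1
x, y = np.meshgrid(np.arange(0.0, 6.0, dx), np.arange(0.0, 6.0, dy))

# Test field u = x, v = -y: its divergence du/dx + dv/dy is exactly 0.
ucom = x
vcom = -y

# With meshgrid's default 'xy' indexing, x varies along axis 1 and y along axis 0.
dudx = np.gradient(ucom, dx, axis=1)
dvdy = np.gradient(vcom, dy, axis=0)

divergence = dudx + dvdy
print(np.max(np.abs(divergence)))  # ~0 up to floating-point rounding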
I am trying to run a t-SNE analysis on a square distance matrix. These are the commands I am using:
from sklearn.manifold import TSNE

model = TSNE(n_components=2, perplexity=32, verbose=10, n_iter=1000, metric="precomputed")
embeddings = model.fit_transform(D)  # D is the square distance matrix
This is the output I receive: [screenshot: output from the t-SNE function]
It looks like the program runs through 75 iterations, then calls it good and quits. When I plot the data from the t-SNE, it's basically just a single dense blob. Why is the program quitting early, and how can I make it run longer?
It's quitting because an exit condition has been reached.
Interpreting the log, the exit condition is probably a metric on the gradient, called the gradient norm here. If needed, check out the basics of gradient descent to understand the intuition. As every iteration takes a step in the direction of the negative gradient, tiny gradients will not do much to the objective (and will be interpreted as: we found a local/global minimum).
It looks like (still interpreting your log only):
if np.linalg.norm(gradient) < 1e-4:
return solution
There is no merit in doing more iterations for this parameterization of the optimization problem. The solution won't get better (in terms of minimization).
You can only try other parameters (resulting in other optimization problems).
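If the goal is simply to let the optimizer run longer, scikit-learn's TSNE exposes the relevant thresholds as keyword arguments; a sketch, assuming a scikit-learn version that accepts these parameters:
from sklearn.manifold import TSNE

# min_grad_norm is the gradient-norm exit condition discussed above; lowering
# it (and raising n_iter_without_progress) keeps the optimizer iterating.
model = TSNE(n_components=2, perplexity=32, verbose=10, n_iter=5000,
             metric="precomputed", min_grad_norm=1e-12,
             n_iter_without_progress=1000)
embeddings = model.fit_transform(D)  # D: the square distance matrix from the question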
I have obviously read through the documentation, but I have not been able to find a more detailed description of what is happening under the covers. Specifically, there are a few behaviors that I am very confused about:
General setup
import numpy as np
from scipy.integrate import ode

# Constants in ODE
N = 30
K = 0.5
w = np.random.normal(np.pi, 0.1, N)

# Integration parameters
y0 = np.linspace(0, 2*np.pi, N, endpoint=False)
t0 = 0

# Set up the solver
solver = ode(lambda t, y: w + K/N*np.sum(np.sin(y - y.reshape(N, 1)), axis=1))
solver.set_integrator('vode', method='bdf')
solver.set_initial_value(y0, t0)
Problem 1: solver.integrate(t0) fails
After setting up the integrator, asking for the value at t0 the first time returns a successful integration. Repeating this returns the correct numbers, but the solver.successful() method returns False:
solver.integrate(t0)
>>> array([ 0. , 0.20943951, 0.41887902, ..., 5.65486678,
5.86430629, 6.0737458 ])
solver.successful()
>>> True
solver.integrate(t0)
>>> array([ 0. , 0.20943951, 0.41887902, ..., 5.65486678,
5.86430629, 6.0737458 ])
solver.successful()
>>> False
My question is, what is happening in the solver.integrate(t) method that causes it to succeed the first time, and fail subsequently, and what does it mean to have an “unsuccessful” integration? Furthermore, why does the integrator fail silently, and continue to produce useful-looking outputs until I ask it explicitly whether it was successful?
Related, is there a way to reset the failed integration, or do I need to re-instantiate the solver from scratch?
Problem 2: solver.integrate(t) immediately returns an answer for almost any value of t
Even though my initial value of y0 is given at t0=0, I can request the value at t=10000 and get the answer immediately. I would expect that the numerical integration over such a large time span should take at least a few seconds (e.g. in Matlab, asking to integrate over 10000 time steps would take several minutes).
For example, re-run the setup from above and execute:
solver.integrate(10000)
>>> array([ 2153.90803383, 2153.63023706, 2153.60964064, ..., 2160.00982959,
2159.90446056, 2159.82900895])
Is Python really that fast, or is this output total nonsense?
Problem 0
Don’t ignore error messages. Yes, ode’s error messages can be cryptic at times, but you still want to avoid them.
Problem 1
As you already integrated up to t0 with the first call of solver.integrate(t0), you are integrating for a time step of 0 with the second call. This throws the cryptic error:
DVODE-- ISTATE (=I1) .gt. 1 but DVODE not initialized
In above message, I1 = 2
/usr/lib/python3/dist-packages/scipy/integrate/_ode.py:869: UserWarning: vode: Illegal input detected. (See printed message.)
'Unexpected istate=%s' % istate))
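As for the follow-up question about resetting: as far as I can tell, calling set_initial_value again re-initializes the integrator's internal state, so a sketch like the following should recover without re-instantiating the solver:
# Re-initializing resets the integrator's state after a failed call.
solver.set_initial_value(y0, t0)
solver.integrate(t0 + 1.0)
print(solver.successful())  # True again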
Problem 2.1
There is a maximum number of (internal) steps that a solver is going to take in one call without throwing an error. This can be set with the nsteps argument of set_integrator. If you integrate a large time at once, nsteps will be exceeded even if nothing is wrong, and the following error message is thrown:
/usr/lib/python3/dist-packages/scipy/integrate/_ode.py:869: UserWarning: vode: Excess work done on this call. (Perhaps wrong MF.)
'Unexpected istate=%s' % istate))
The integrator then stops wherever this happens.
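A minimal sketch of raising that limit, using the setup from the question (nsteps is a documented argument of set_integrator):
# Allow far more internal steps per integrate() call.
solver.set_integrator('vode', method='bdf', nsteps=10**10)
solver.set_initial_value(y0, t0)
solver.integrate(10000)
print(solver.successful())  # no excess-work warning now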
Problem 2.2
If you set nsteps=10**10, the integration runs without problems. It still is pretty fast though (roughly 1 s on my machine). The reason for this is as follows:
For a multi-dimensional system such as yours, there are two main runtime sinks when integrating:
Vector and matrix operations within the integrator. In scipy.ode, these are all realised with NumPy operations or ported Fortran or C code. Either way, they run as compiled code without Python overhead and are thus very efficient.
Evaluating the derivative (lambda t, y: w + K/N*np.sum(np.sin(y - y.reshape(N, 1)), axis=1) in your case). You realised this with NumPy operations, which again run as compiled code and are very efficient. You may improve this a little with a purely compiled function, but that will gain you at most a small factor. If you used Python lists and loops instead, it would be horribly slow.
Therefore, for your problem, everything relevant is handled with compiled code under the hood, and the integration runs with an efficiency comparable to that of, e.g., a pure C program. I do not know how the two above aspects are handled in Matlab, but if either is handled with interpreted instead of compiled loops, that would explain the runtime discrepancy you observe.
To the second question: yes, the output might be nonsense. Local errors, be they from discretization or floating-point operations, accumulate with a compounding factor that is roughly the Lipschitz constant of the ODE's right-hand side. As a first estimate, the Lipschitz constant here is K = 0.5. The magnification rate of early errors, that is, their coefficient as part of the global error, can thus be as large as exp(0.5 * 10000), which is a huge number.
On the other hand it is not surprising that the integration is fast. Most of the provided methods use step size adaptation, and with the standard error tolerances this might result in only some tens of internal steps. Reducing the error tolerances will increase the number of internal steps and may change the numerical result drastically.
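For instance, a sketch of tightening the tolerances in the same setup (atol and rtol are documented arguments of set_integrator for 'vode'):
# Tighter tolerances force the adaptive solver to take more and smaller
# internal steps; the large nsteps budget keeps it from aborting.
solver.set_integrator('vode', method='bdf', nsteps=10**10,
                      atol=1e-12, rtol=1e-12)
solver.set_initial_value(y0, t0)
tight = solver.integrate(10000)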
I'm working on some Python code to interpolate irregular data onto a 180° lat × 360° lon spherical grid. The code is currently hanging when I call the following:
import numpy as np
from numpy import pi
from scipy.interpolate import LSQSphereBivariateSpline

def geo_interp(lats, lons, data, grid_size_deg):
    deg2rad = pi/180.
    new_lats = np.linspace(grid_size_deg, 180, 180) * deg2rad
    new_lons = np.linspace(grid_size_deg, 360, 360) * deg2rad
    knotst, knotsp = new_lats.copy(), new_lons.copy()
    # Nudge the end knots strictly inside the valid intervals (0, pi) and (0, 2*pi).
    knotst[0] += .0001
    knotst[-1] -= .0001
    knotsp[0] += .0001
    knotsp[-1] -= .0001
    lut = LSQSphereBivariateSpline(lats.ravel(), lons.ravel(), data.T.ravel(), knotst, knotsp)
    data_interp = lut(new_lats, new_lons)
    return data_interp
The arrays I'm using as arguments when I call the above subroutine all fit the requirements of LSQSphereBivariateSpline as listed in the documentation. When I run it, it takes much longer than I feel it should to process a 180×360 dataset.
When I run the script with python -m trace --trace, the last line of output before nothing happens for a long time is
fitpack2.py(1025): w=w, eps=eps)
As far as I can tell, line 1025 of fitpack2.py is in a comment, which is even more confusing.
So my questions are:
1. Is there a way to tell if it's hanging or just very slow?
2. If it's hanging, how might I fix it?
The only thing I can think of is that I have no idea what I'm doing as far as choosing knots. Is there a good way to choose those? I just went with the grid I'll be interpolating to later, since the example in the doc seemed to be an arbitrary grid.
UPDATE: It finally finished after about 3 hours, but the "interpolated data" looked like random noise. Also, if this is relevant: as far as I can tell, LSQSphereBivariateSpline is the only function I can use for this, because my lats and lons are not strictly increasing.
Also, I should add that when it finished, it output the following warning:
Warning (from warnings module):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/interpolate/fitpack2.py", line 1029
warnings.warn(message)
UserWarning:
WARNING. The coefficients of the spline returned have been computed as the
minimal norm least-squares solution of a (numerically) rank
deficient system (deficiency=16336, rank=48650). Especially if the rank
deficiency, which is computed by 6+(nt-8)*(np-7)+ier, is large,
the results may be inaccurate. They could also seriously depend on
the value of eps.
SOLUTION: I had far too many knots, causing both the glacial pace and the useless results.
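For illustration, a sketch of what "fewer knots" can look like; the 10° spacing is illustrative, not from the original post. The knots only control the spline's flexibility, so they can be much coarser than the 1° output grid:
import numpy as np
from numpy import pi

# Interior knots every 10 degrees, kept strictly inside (0, pi) and
# (0, 2*pi) as LSQSphereBivariateSpline requires.
knotst = np.linspace(0, pi, 19)[1:-1]      # 17 latitude knots
knotsp = np.linspace(0, 2*pi, 37)[1:-1]    # 35 longitude knots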
I am trying to start using the AR models in statsmodels. However, I seem to be doing something wrong. Consider the following example, which fails:
from statsmodels.tsa.ar_model import AR
import numpy as np
signal = np.ones(20)
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I think this should just continue the (trivial) time series consisting of ones. However, in this case it seems to return too few parameters: len(ar_res.params) equals 4, while it should be 5. In the following example it works:
signal = np.ones(20)
signal[range(0, 20, 2)] = -1
ar_mod = AR(signal)
ar_res = ar_mod.fit(4)
ar_res.predict(4, 60)
I have the feeling that this could be a bug but I am not sure as I have no experience using the package. Maybe someone with more experience can help me...
EDIT: I have reported the issue here.
It works after adding a bit of noise, for example:
signal = np.ones(20) + 1e-6 * np.random.randn(20)
My guess is that the constant is not added properly because of perfect collinearity with the signal.
You should open an issue to handle this corner case better. https://github.com/statsmodels/statsmodels/issues
My guess is also that the parameters are not identified in this case, so there might not be any good solution.
(Parameters not identified means that several parameter combinations can produce exactly the same fit, but I think they should all produce the same predictions in this case.)
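To make the collinearity concrete, here is a minimal sketch (a hand-built design matrix, not statsmodels internals): for an all-ones series, the AR(4)-with-constant regressors are five identical columns:
import numpy as np

signal = np.ones(20)
# Rows are [1, y_{t-1}, y_{t-2}, y_{t-3}, y_{t-4}] for t = 4..19.
X = np.column_stack([np.ones(16)] +
                    [signal[4 - k:20 - k] for k in range(1, 5)])
print(np.linalg.matrix_rank(X))  # 1 -- the constant is perfectly collinear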