Implementing derivatives of logs of data in python

Implementing derivatives of logs of data in python - python

We have two lists (vectors) of data, y and x, we can imagine x being time steps (0,1,2,...) and y some system property computed at value each value of x.
I'm interested in calculating the derivative of log of y with respect to log of x, and the question is how to perform such calculations in Python?
We can start off by using numpy to calculate the logs: logy = np.log(y) and logx = np.log(x). Then what method do we use for the differentiation dlog(y)/dlog(x)?
One option that comes to mind is using np.gradient() in the following way:
deriv = np.gradient(logy,np.gradient(logx)).
Is this a valid way of going about this calculation?
Are there better (or equivalent) alternatives without using np.gradient?

Right after looking at the source of np.gradient here and looking around you can see it changed in numpy version 1.14, hence why the docs change.
I have version 1.11. So I think that gradient is defined as def gradient(y, x) -> dy/dx if isinstance(x, np.ndarray) now but isn't in version 1.11. Doing np.gradient(y, np.array(...)) is actually, I think, undefined behaviour!
However, np.gradient(y) / np.gradient(x) works for all numpy versions. Use that!
Proof:
import numpy as np
import matplotlib.pyplot as plt
x = np.sort(np.random.random(10000)) * 2 * np.pi
y = np.sin(x)
dy_dx = np.gradient(y) / np.gradient(x)
plt.plot(x, dy_dx)
plt.show()
Looks an awful lot like a cos wave

Related

How do I obtain exact value from scipy interpolation/matplotlib? [duplicate]

I am trying to invert an interpolated function using scipy's interpolate function. Let's say I create an interpolated function,
import scipy.interpolate as interpolate
interpolatedfunction = interpolated.interp1d(xvariable,data,kind='cubic')
Is there some function that can find x when I specify a:
interpolatedfunction(x) == a
In other words, "I want my interpolated function to equal a; what is the value of xvariable such that my function is equal to a?"
I appreciate I can do this with some numerical scheme, but is there a more straightforward method? What if the interpolated function is multivalued in xvariable?

There are dedicated methods for finding roots of cubic splines. The simplest to use is the .roots() method of InterpolatedUnivariateSpline object:
spl = InterpolatedUnivariateSpline(x, y)
roots = spl.roots()
This finds all of the roots instead of just one, as generic solvers (fsolve, brentq, newton, bisect, etc) do.
x = np.arange(20)
y = np.cos(np.arange(20))
spl = InterpolatedUnivariateSpline(x, y)
print(spl.roots())
outputs array([ 1.56669456, 4.71145244, 7.85321627, 10.99554642, 14.13792756, 17.28271674])
However, you want to equate the spline to some arbitrary number a, rather than 0. One option is to rebuild the spline (you can't just subtract a from it):
solutions = InterpolatedUnivariateSpline(x, y - a).roots()
Note that none of this will work with the function returned by interp1d; it does not have roots method. For that function, using generic methods like fsolve is an option, but you will only get one root at a time from it. In any case, why use interp1d for cubic splines when there are more powerful ways to do the same kind of interpolation?
Non-object-oriented way
Instead of rebuilding the spline after subtracting a from data, one can directly subtract a from spline coefficients. This requires us to drop down to non-object-oriented interpolation methods. Specifically, sproot takes in a tck tuple prepared by splrep, as follows:
tck = splrep(x, y, k=3, s=0)
tck_mod = (tck[0], tck[1] - a, tck[2])
solutions = sproot(tck_mod)
I'm not sure if messing with tck is worth the gain here, as it's possible that the bulk of computation time will be in root-finding anyway. But it's good to have alternatives.

After creating an interpolated function interp_fn, you can find the value of x where interp_fn(x) == a by the roots of the function
interp_fn2 = lambda x: interp_fn(x) - a
There are number of options to find the roots in scipy.optimize. For instance, to use Newton's method with the initial value starting at 10:
from scipy import optimize
optimize.newton(interp_fn2, 10)
Actual example
Create an interpolated function and then find the roots where fn(x) == 5
import numpy as np
from scipy import interpolate, optimize
x = np.arange(10)
y = 1 + 6*np.arange(10) - np.arange(10)**2
y2 = 5*np.ones_like(x)
plt.scatter(x,y)
plt.plot(x,y)
plt.plot(x,y2,'k-')
plt.show()
# create the interpolated function, and then the offset
# function used to find the roots
interp_fn = interpolate.interp1d(x, y, 'quadratic')
interp_fn2 = lambda x: interp_fn(x)-5
# to find the roots, we need to supply a starting value
# because there are more than 1 root in our range, we need
# to supply multiple starting values. They should be
# fairly close to the actual root
root1, root2 = optimize.newton(interp_fn2, 1), optimize.newton(interp_fn2, 5)
root1, root2
# returns:
(0.76393202250021064, 5.2360679774997898)

If your data are monotonic you might also try the following:
inversefunction = interpolated.interp1d(data, xvariable, kind='cubic')

Mentioning another option because I found this page in a google search and the other option works for my simple use case. Hopefully it'll be of use to someone.
If the function you're interpolating is very simple and always has a 1:1 relationship between y and x, then you can simply take your data, swap x and y when you pass it into interp1d, and then call the interpolation function in that direction.
Adapting code from https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html
import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate
x = np.arange(0, 10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y)
xnew = np.arange(0, 9, 0.1)
ynew = f(xnew)
plt.plot(x, y, 'o', xnew, ynew, '-')
plt.show()
When x and y have been swapped you can call swappedInterpolationFunction(a) to get the x value where that would occur.
f = interpolate.interp1d(y, x)
xnew = np.arange(np.exp(-9/3), np.exp(0), 0.01)
ynew = f(xnew)
plt.plot(y, x, 'o', xnew, ynew, '-')
plt.title("Inverted")
plt.show()
Of course, if the function ever has multiple x values for a given y value (like sine or a parabola) then this will not work because it will no longer be a 1:1 function from x to y, and the above answers are necessary. This is just a simplification in a limited use case.

Scipy: efficiently generate a series of integration (integral function)

I have a function, I want to get its integral function, something like this:
That is, instead of getting a single integration value at point x, I need to get values at multiple points.
For example:
Let's say I want the range at (-20,20)
def f(x):
return x**2
x_vals = np.arange(-20, 21, 1)
y_vals =[integrate.nquad(f, [[0, x_val]]) for x_val in x_vals ]
plt.plot(x_vals, y_vals,'-', color = 'r')
The problem
In the example code I give above, for each point, the integration is done from scratch. In my real code, the f(x) is pretty complex, and it's a multiple integration, so the running time is simply too slow(Scipy: speed up integration when doing it for the whole surface?).
I'm wondering if there is any way of efficient generating the Phi(x), at a giving range.
My thoughs:
The integration value at point Phi(20) is calucation from Phi(19), and Phi(19) is from Phi(18) and so on. So when we get Phi(20), in reality we also get the series of (-20,-19,-18,-17 ... 18,19,20). Except that we didn't save the value.
So I'm thinking, is it possible to create save points for a integrate function, so when it passes a save point, the value would get saved and continues to the next point. Therefore, by a single process toward 20, we could also get the value at (-20,-19,-18,-17 ... 18,19,20)

One could implement the strategy you outlined by integrating only over the short intervals (between consecutive x-values) and then taking the cumulative sum of the results. Like this:
import numpy as np
import scipy.integrate as si
def f(x):
return x**2
x_vals = np.arange(-20, 21, 1)
pieces = [si.quad(f, x_vals[i], x_vals[i+1])[0] for i in range(len(x_vals)-1)]
y_vals = np.cumsum([0] + pieces)
Here pieces are the integrals over short intervals, which get summed to produce y-values. As written, this code outputs a function that is 0 at the beginning of the range of integration which is -20. One can, of course, subtract the y-value that corresponds to x=0 in order to have the same normalization as on your plot.
That said, the split-and-sum process is unnecessary. When you find an indefinite integral of f, you are really solving the differential equation F' = f. And SciPy has a built-in method for that, odeint. Just use it:
import numpy as np
import scipy.integrate as si
def f(x):
return x**2
x_vals = np.arange(-20, 21, 1)
y_vals = si.odeint(lambda y,x: f(x), 0, x_vals)
The output is essential identical to the first version (within tiny computational errors), with less code. The reason for using lambda y,x: f(x) is that the first argument of odeint must be a function taking two arguments, the right-hand side of the equation y' = f(y, x).

For the equivalent version of user3717023's answer using scipy's solve_ivp you need to keep in mind the different ordering of x and y in the function f (different from the odeint version).
Further, keep in mind that you can only compute the solution up to a constant. So you might want to shift the result according to some given condition. In the example here (with the function f(x)=x^2 as given by the OP), I shifted the numeric solution such that it goes through the origin, matching the simplest analytic solution F(x)=x^3/3.
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp
def f(x):
return x**2
xs = np.linspace(-20, 20, 1001)
# This is the integration step:
sol = solve_ivp(lambda x, y: f(x), t_span=(xs[0], xs[-1]), y0=[0], t_eval=xs)
plt.plot(sol.t, sol.t**3/3, ls='-', c='C0', label="analytic: $F(x)=x^3/3$")
plt.plot(sol.t, sol.y[0], ls='--', c='C1', label="numeric solution")
plt.plot(sol.t, sol.y[0] - sol.y[0][sol.t.size//2], ls='-.', c='C3', label="shifted solution going through origin")
plt.legend()
In case you don't have an analytical version of the function f, but only xs and ys as data points, then you can use scipy's interp1d function to interpolate between the data points and pass on that interpolating function the same way as before:
from scipy.interpolate import interp1d
f = interp1d(xs, ys)

scipy.interp2d warning and large errors off the grid

I am trying to interpolate a 2-dimensional function and I am running into what I consider weird behavior by scipy.interpolate.interp2d. I don't understand what the problem is, and I'd be happy for any help or hints.
import numpy as np
from scipy.interpolate import interp2d
x = np.arange(10)
y = np.arange(20)
xx, yy = np.meshgrid(x, y, indexing = 'ij')
val = xx + yy
f = interp2d(xx, yy, val, kind = 'linear')
When I run this code, I get the following Warning:
scipy/interpolate/fitpack.py:981: RuntimeWarning: No more knots can be
added because the number of B-spline coefficients already exceeds the
number of data points m. Probable causes: either s or m too small.
(fp>s) kx,ky=1,1 nx,ny=18,15 m=200 fp=0.000000 s=0.000000
warnings.warn(RuntimeWarning(_iermess2[ierm][0] + _mess))
I don't understand why interp2d would use any splines when I tell it it should do linear interpolation. When I continue and evaluate f on the grid everything is good:
>>> f(1,1)
array([ 2.])
When I evaluate it off the grid, I get large errors, even though the function is clearly linear.
>>> f(1.1,1)
array([ 2.44361975])
I am a bit confused and I am not sure what the problem is. Did anybody run into similar problems? I used to work with matlab and this is almost 1:1 how I would do it there, but maybe I did something wrong.
When I use a rectangular grid (i.e. y = np.arange(10)) everything works fine by the way, but that isn't what I need. When I use cubic instead of linear interpolation, the error gets smaller (that doesn't make much sense either since the function is linear) but is still unacceptably large.

I tried a couple of things and managed to get (kind of) what I want using scipy.LinearNDInterpolator. However, I have to convert the grid to lists of points and values. Since the rest of my program stores coordinates and values in grid format that is kind of annoying, so if possible I'd still like to get the original code to work properly.
import numpy as np
import itertools
from scipy.interpolate import LinearNDInterpolator
x = np.arange(10)
y = np.arange(20)
coords = list(itertools.product(x,y))
val = [sum(c) for c in coords]
f = LinearNDInterpolator(coords, val)
>>>f(1,1)
array(2.0)
>>> f(1.1,1)
array(2.1)

How to plot grad(f(x,y))?

I want to calculate and plot a gradient of any scalar function of two variables. If you really want a concrete example, lets say f=x^2+y^2 where x goes from -10 to 10 and same for y. How do I calculate and plot grad(f)? The solution should be vector and I should see vector lines. I am new to python so please use simple words.
EDIT:
#Andras Deak: thank you for your post, i tried what you suggested and instead of your test function (fun=3*x^2-5*y^2) I used function that i defined as V(x,y); this is how the code looks like but it reports an error
import numpy as np
import math
import sympy
import matplotlib.pyplot as plt
def V(x,y):
t=[]
for k in range (1,3):
for l in range (1,3):
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
return term
return term.sum()
x,y=sympy.symbols('x y')
fun=V(x,y)
gradfun=[sympy.diff(fun,var) for var in (x,y)]
numgradfun=sympy.lambdify([x,y],gradfun)
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
AttributeError: 'Mul' object has no attribute 'sin'
And lets say I remove sin, I get another error:
TypeError: can't multiply sequence by non-int of type 'Mul'
I read tutorial for sympy and it says "The real power of a symbolic computation system such as SymPy is the ability to do all sorts of computations symbolically". I get this, I just dont get why I cannot multiply x and y symbols with float numbers.
What is the way around this? :( Help please!
UPDATE
#Andras Deak: I wanted to make things shorter so I removed many constants from the original formulas for V(x,y) and Cn*Dm. As you pointed out, that caused the sin function to always return 0 (i just noticed). Apologies for that. I will update the post later today when i read your comment in details. Big thanks!
UPDATE 2
I changed coefficients in my expression for voltage and this is the result:
It looks good except that the arrows point in the opposite direction (they are supposed to go out of the reddish dot and into the blue one). Do you know how I could change that? And if possible, could you please tell me the way to increase the size of the arrows? I tried what was suggested in another topic (Computing and drawing vector fields):
skip = (slice(None, None, 3), slice(None, None, 3))
This plots only every third arrow and matplotlib does the autoscale but it doesnt work for me (nothing happens when i add this, for any number that i enter)
You were already of huge help , i cannot thank you enough!

Here's a solution using sympy and numpy. This is the first time I use sympy, so others will/could probably come up with much better and more elegant solutions.
import sympy
#define symbolic vars, function
x,y=sympy.symbols('x y')
fun=3*x**2-5*y**2
#take the gradient symbolically
gradfun=[sympy.diff(fun,var) for var in (x,y)]
#turn into a bivariate lambda for numpy
numgradfun=sympy.lambdify([x,y],gradfun)
now you can use numgradfun(1,3) to compute the gradient at (x,y)==(1,3). This function can then be used for plotting, which you said you can do.
For plotting, you can use, for instance, matplotlib's quiver, like so:
import numpy as np
import matplotlib.pyplot as plt
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
UPDATE
You added a specification for your function to be computed. It contains the product of terms depending on x and y, which seems to break my above solution. I managed to come up with a new one to suit your needs. However, your function seems to make little sense. From your edited question:
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
On the other hand, from your corresponding comment to this answer:
V(x,y) = Sum over n and m of [Cn * Dm * sin(2pinx) * cos(2pimy)]; sum goes from -10 to 10; Cn and Dm are coefficients, and i calculated
that CkDl = sin(2pik)/(k^2 +l^2) (i used here k and l as one of the
indices from the sum over n and m).
I have several problems with this: both sin(2*pi*k) and sin(2*pi*k/2) (the two competing versions in the prefactor are always zero for integer k, giving you a constant zero V at every (x,y). Furthermore, in your code you have magical frequency factors in the trigonometric functions, which are missing from the comment. If you multiply your x by 4e-3, you drastically change the spatial dependence of your function (by changing the wavelength by roughly a factor of a thousand). So you should really decide what your function is.
So here's a solution, where I assumed
V(x,y)=sum_{k,l = 1 to 10} C_{k,l} * sin(2*pi*k*x)*cos(2*pi*l*y), with
C_{k,l}=sin(2*pi*k/4)/((4*pi^2)*(k^2+l^2))*1e-6
This is a combination of your various versions of the function, with the modification of sin(2*pi*k/4) in the prefactor in order to have a non-zero function. I expect you to be able to fix the numerical factors to your actual needs, after you figure out the proper mathematical model.
So here's the full code:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
def CD(k,l):
#return sp.sin(2*sp.pi*k/2)/((4*sp.pi**2)*(k**2+l**2))*1e-6
return sp.sin(2*sp.pi*k/4)/((4*sp.pi**2)*(k**2+l**2))*1e-6
def Vkl(x,y,k,l):
return CD(k,l)*sp.sin(2*sp.pi*k*x)*sp.cos(2*sp.pi*l*y)
def V(x,y,kmax,lmax):
k,l=sp.symbols('k l',integers=True)
return sp.summation(Vkl(x,y,k,l),(k,1,kmax),(l,1,lmax))
#define symbolic vars, function
kmax=10
lmax=10
x,y=sp.symbols('x y')
fun=V(x,y,kmax,lmax)
#take the gradient symbolically
gradfun=[sp.diff(fun,var) for var in (x,y)]
#turn into bivariate lambda for numpy
numgradfun=sp.lambdify([x,y],gradfun,'numpy')
numfun=sp.lambdify([x,y],fun,'numpy')
#plot
X,Y=np.meshgrid(np.linspace(-10,10,51),np.linspace(-10,10,51))
graddat=numgradfun(X,Y)
fundat=numfun(X,Y)
hf=plt.figure()
hc=plt.contourf(X,Y,fundat,np.linspace(fundat.min(),fundat.max(),25))
plt.quiver(X,Y,graddat[0],graddat[1])
plt.colorbar(hc)
plt.show()
I defined your V(x,y) function using some auxiliary functions for transparence. I left the summation cut-offs as literal parameters, kmax and lmax: in your code these were 3, in your comment they were said to be 10, and anyway they should be infinity.
The gradient is taken the same way as before, but when converting to a numpy function using lambdify you have to set an additional string parameter, 'numpy'. This will alow the resulting numpy lambda to accept array input (essentially it will use np.sin instead of math.sin and the same for cos).
I also changed the definition of the grid from array to np.linspace: this is usually more convenient. Since your function is almost constant at integer grid points, I created a denser mesh for plotting (51 points while keeping your original limits of (-10,10) fixed).
For clarity I included a few more plots: a contourf to show the value of the function (contour lines should always be orthogonal to the gradient vectors), and a colorbar to indicate the value of the function. Here's the result:
The composition is obviously not the best, but I didn't want to stray too much from your specifications. The arrows in this figure are actually hardly visible, but as you can see (and also evident from the definition of V) your function is periodic, so if you plot the same thing with smaller limits and less grid points, you'll see more features and larger arrows.

Integral of data with time (Python)

I have a time series x(t) that is a NumPy array. My assignment tells me that I need to find the integral of this data with time.
How am I supposed to do this? It's not a function that I need to integrate, it's a list of data.

It depends on the statement of the problem. A rude approach would be something like this
import numpy as np
import scipy as sp
t = np.linspace(-1, 1, 100)
x = t*t
delta = t[1] - t[0]
I = sum(delta*x)

You can use Simpson's Rule. A routine that does that for you is simps in spicy.integrate.
>>> help(scipy.integrate.simps)
Help on function simps in module scipy.integrate.quadrature:
simps(y, x=None, dx=1, axis=-1, even='avg')
Integrate y(x) using samples along the given axis and the composite
Simpson's rule. If x is None, spacing of dx is assumed.
If there are an even number of samples, N, then there are an odd
number of intervals (N-1), but Simpson's rule requires an even number
of intervals. The parameter 'even' controls how this is handled.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Implementing derivatives of logs of data in python - python

Related

How do I obtain exact value from scipy interpolation/matplotlib? [duplicate]

Scipy: efficiently generate a series of integration (integral function)

scipy.interp2d warning and large errors off the grid

How to plot grad(f(x,y))?

Integral of data with time (Python)

Categories

Resources