Related
I am trying to write a program that uses an array in further calculations. I initialize a grid of equally spaced points with NumPy and assign a value at each point as per the code snippet provided below. The function I am trying to describe with this array gives me a division by 0 error at x=y and it generally blows up around it. I know that the real part of said function is bounded by band_D/(2*math.pi)
at x=y and I tried manually assigning this value on the diagonal, but it seems that points around it are still ill-behaved and so I am not getting any right values. Is there a way to remedy this? This is how the function looks like with matplotlib
gamma=5
band_D=100
Dt=1e-3
x = np.arange(0,1/gamma,Dt)
y = np.arange(0,1/gamma,Dt)
xx,yy= np.meshgrid(x,y)
N=x.shape[0]
di = np.diag_indices(N)
time_fourier=(1j/2*math.pi)*(1-np.exp(1j*band_D*(xx-yy)))/(xx-yy)
time_fourier[di]=band_D/(2*math.pi)
You have a classic 0 / 0 problem. It's not really Numpy's job to figure out to apply De L'Hospital and solve this for you... I see, as other have commented, that you had the right idea with trying to set the limit value at the diagonal (where x approx y), but by the time you'd hit that line, the warning had already been emitted (just a warning, BTW, not an exception).
For a quick fix (but a bit of a fudge), in this case, you can try to add a small value to the difference:
xy = xx - yy + 1e-100
num = (1j / 2*np.pi) * (1 - np.exp(1j * band_D * xy))
time_fourier = num / xy
This also reveals that there is something wrong with your limit calculation... (time_fourier[0,0] approx 157.0796..., not 15.91549...).
and not band_D / (2*math.pi).
For a correct calculation:
def f(xy):
mask = xy != 0
limit = band_D * np.pi/2
return np.where(mask, np.divide((1j/2 * np.pi) * (1 - np.exp(1j * band_D * xy)), xy, where=mask), limit)
time_fourier = f(xx - yy)
You are dividing by x-y, that will definitely throw an error when x = y. The function being well behaved here means that the Taylor series doesn't diverge. But python doesn't know or care about that, it just calculates one step at a time until it reaches division by 0.
You had the right idea by defining a different function when x = y (ie, the mathematically true answer) but your way of applying it doesn't work because the correction is AFTER the division by 0, so it never gets read. This, however, should work
def make_time_fourier(x, y):
if np.isclose(x, y):
return band_D/(2*math.pi)
else:
return (1j/2*math.pi)*(1-np.exp(1j*band_D*(x-y)))/(x-y)
time_fourier = np.vectorize(make_time_fourier)(xx, yy)
print(time_fourier)
You can use np.divide with where option.
import math
gamma=5
band_D=100
Dt=1e-3
x = np.arange(0,1/gamma,Dt)
y = np.arange(0,1/gamma,Dt)
xx,yy = np.meshgrid(x,y)
N = x.shape[0]
di = np.diag_indices(N)
time_fourier = (1j / 2 * np.pi) * (1 - np.exp(1j * band_D * (xx - yy)))
time_fourier = np.divide(time_fourier,
(xx - yy),
where=(xx - yy) != 0)
time_fourier[di] = band_D / (2 * np.pi)
You can reformulate your function so that the division is inside the (numpy) sinc function, which handles it correctly.
To save typing I'll use D for band_D and use a variable
z = D*(xx-yy)/2
Then
T = (1j/2*pi)*(1-np.exp(1j*band_D*(xx-yy)))/(xx-yy)
= (2/D)*(1j/2*pi)*( 1 - cos( 2*z) - 1j*sin( 2*z))/z
= (1j/D*pi)* (2*sin(z)*sin(z) - 2j*sin(z)*cos(z))/z
= (2j/D*pi) * sin(z)/z * (sin(z) - 1j*cos(z))
= (2j/D*pi) * sinc( z/pi) * (sin(z) - 1j*cos(z))
numpy defines
sinc(x) to be sin(pi*x)/(pi*x)
I can't run python do you should chrck my calculations
The steps are
Substitute the definition of z and expand the complex exp
Apply the double angle formulae for sin and cos
Factor out sin(z)
Substitute the definition of sinc
I have the code below for fft2 performed by numpy and a 2d fft performed by direct code. an anyone point out why they are different? My inputmatreix is rA.
def DFT_matrix(N):
i, j = np.meshgrid(np.arange(N), np.arange(N))
omega = np.exp( - 2 * math.pi * 1J / N )
W = np.power( omega, i * j ) / np.sqrt(N)
return W
sizeM=40
rA=np.random.rand(sizeM,sizeM)
rAfft=np.fft.fft2(rA)
rAfftabs=np.abs(rAfft)+1e-9
dftMtx=DFT_matrix(sizeM)
dftR=dftMtx.conj().T
mA=dftMtx*rA*dftR
mAabs=np.abs(mA)+1e-9
print(np.allclose(mAabs, rAfftabs))
There are a few problems with your implementation.
1. DFT Matrix formula
First of all, as explained here, the formula for computing the DFT X of a MxN signal x is defined as:
Since you are computing the DFT for a MxM input, you just need to compute the DFT_Matrix once. Also note that due to the way W is defined, there is no need for conjugation and since W is symmetric and unitary there is no need for any transpose.
2. Matrix multiplication
When it comes to actually multiplying the matrixes together, you have to make sure to use the matrix multiplication operator # instead of the element wise multiplier *
3. DFT_matrix normalization
By default the fft functions don't normalize your output. This means that before comparing the two outputs, you either have to divide the np.fft.fft2 result by sqrt(M*M) = M or you drop the np.sqrt(N) in your DFT_matrix function.
Summary:
Her is your example with the appropriate fixes for a MxN input. At the end, the magnitudes and angles are compared.
import numpy as np
def DFT_matrix(N):
i, j = np.meshgrid(np.arange(N), np.arange(N))
omega = np.exp( - 2 * np.pi * 1j / N )
W = np.power( omega, i * j ) # Normalization by sqrt(N) Not included
return W
sizeM=40
sizeN=20
np.random.seed(0)
rA=np.random.rand(sizeM,sizeN)
rAfft=np.fft.fft2(rA)
dftMtxM=DFT_matrix(sizeM)
dftMtxN=DFT_matrix(sizeN)
# Matrix multiply the 3 matrices together
mA = dftMtxM # rA # dftMtxN
print(np.allclose(np.abs(mA), np.abs(rAfft)))
print(np.allclose(np.angle(mA), np.angle(rAfft)))
Both checks should evaluate to True. However note that the complexity of this algorithm, assuming M=N is N³ while the library's fft2 brings that down to N²log(N)!
The problem is that I would like to be able to integrate the differential equations starting for each point of the grid at once instead of having to loop over the scipy integrator for each coordinate. (I'm sure there's an easy way)
As background for the code I'm trying to solve the trajectories of a Couette flux alternating the direction of the velocity each certain period, that is a well known dynamical system that produces chaos. I don't think the rest of the code really matters as the part of the integration with scipy and my usage of the meshgrid function of numpy.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, writers
from scipy.integrate import solve_ivp
start_T = 100
L = 1
V = 1
total_run_time = 10*3
grid_points = 10
T_list = np.arange(start_T, 1, -1)
x = np.linspace(0, L, grid_points)
y = np.linspace(0, L, grid_points)
X, Y = np.meshgrid(x, y)
condition = True
totals = np.zeros((start_T, total_run_time, 2))
alphas = np.zeros(start_T)
i = 0
for T in T_list:
alphas[i] = L / (V * T)
solution = np.array([X, Y])
for steps in range(int(total_run_time/T)):
t = steps*T
if condition:
def eq(t, x):
return V * np.sin(2 * np.pi * x[1] / L), 0.0
condition = False
else:
def eq(t, x):
return 0.0, V * np.sin(2 * np.pi * x[1] / L)
condition = True
time_steps = np.arange(t, t + T)
xt = solve_ivp(eq, time_steps, solution)
solution = np.array([xt.y[0], xt.y[1]])
totals[i][t: t + T][0] = solution[0]
totals[i][t: t + T][1] = solution[1]
i += 1
np.save('alphas.npy', alphas)
np.save('totals.npy', totals)
The error given is :
ValueError: y0 must be 1-dimensional.
And it comes from the 'solve_ivp' function of scipy because it doesn't accept the format of the numpy function meshgrid. I know I could run some loops and get over it but I'm assuming there must be a 'good' way to do it using numpy and scipy. I accept advice for the rest of the code too.
Yes, you can do that, in several variants. The question remains if it is advisable.
To implement a generally usable ODE integrator, it needs to be abstracted from the models. Most implementations do that by having the state space a flat-array vector space, some allow a vector space engine to be passed as parameter, so that structured vector spaces can be used. The scipy integrators are not of this type.
So you need to translate the states to flat vectors for the integrator, and back to the structured state for the model.
def encode(X,Y): return np.concatenate([X.flatten(),Y.flatten()])
def decode(U): return U.reshape([2,grid_points,grid_points])
Then you can implement the ODE function as
def eq(t,U):
X,Y = decode(U)
Vec = V * np.sin(2 * np.pi * x[1] / L)
if int(t/T)%2==0:
return encode(Vec, np.zeros(Vec.shape))
else:
return encode(np.zeros(Vec.shape), Vec)
with initial value
U0 = encode(X,Y)
Then this can be directly integrated over the whole time span.
Why this might be not such a good idea: Thinking of each grid point and its trajectory separately, each trajectory has its own sequence of adapted time steps for the given error level. In integrating all simultaneously, the adapted step size is the minimum over all trajectories at the given time. Thus while the individual trajectories might have only short intervals with very small step sizes amid long intervals with sparse time steps, these can overlap in the ensemble to result in very small step sizes everywhere.
If you go beyond the testing stage, switch to a more compiled solver implementation, odeint is a Fortran code with wrappers, so half a solution. JITcode translates to C code and links with the compiled solver behind odeint. Leaving python you get sundials, the diffeq module of julia-lang, or boost::odeint.
TL;DR
I don't think you can "integrate the differential equations starting for each point of the grid at once".
MWE
Please try to provide a MWE to reproduce your problem, like you said : "I don't think the rest of the code really matters", and it makes it harder for people to understand your problem.
Understanding how to talk to the solver
Before answering your question, there are several things that seem to be misunderstood :
by defining time_steps = np.arange(t, t + T) and then calling solve_ivp(eq, time_steps, solution) : the second argument of solve_ivp is the time span you want the solution for, ie, the "start" and "stop" time as a 2-uple. Here your time_steps is 30-long (for the first loop), so I would probably replace it by (t, t+T). Look for t_span in the doc.
from what I understand, it seems like you want to control each iteration of the numerical resolution : that's not how solve_ivp works. More over, I think you want to switch the function "eq" at each iteration. Since you have to pass the "the right hand side" of the equation, you need to wrap this behavior inside a function. It would not work (see right after) but in terms of concept something like this:
def RHS(t, x):
# unwrap your variables, condition is like an additional variable of your problem,
# with a very simple differential equation
x0, x1, condition = x
# compute new results for x0 and x1
if condition:
x0_out, x1_out = V * np.sin(2 * np.pi * x[1] / L), 0.0
else:
x0_out, x1_out = 0.0, V * np.sin(2 * np.pi * x[1] / L)
# compute new result for condition
condition_out = not(condition)
return [x0_out, x1_out, condition_out]
This would not work because the evolution of condition doesn't satisfy some mathematical properties of derivation/continuity. So condition is like a boolean switch that parametrizes the model, we can use global to control the state of this boolean :
condition = True
def RHS_eq(t, y):
global condition
x0, x1 = y
# compute new results for x0 and x1
if condition:
x0_out, x1_out = V * np.sin(2 * np.pi * x1 / L), 0.0
else:
x0_out, x1_out = 0.0, V * np.sin(2 * np.pi * x1 / L)
# update condition
condition = 0 if condition==1 else 1
return [x0_out, x1_out]
finaly, and this is the ValueError you mentionned in your post : you define solution = np.array([X, Y]) which actually is initial condition and supposed to be "y0: array_like, shape (n,)" where n is the number of variable of the problem (in the case of [x0_out, x1_out] that would be 2)
A MWE for a single initial condition
All that being said, lets start with a simple MWE for a single starting point (0.5,0.5), so we have a clear view of how to use the solver :
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
# initial conditions for x0, x1, and condition
initial = [0.5, 0.5]
condition = True
# time span
t_span = (0, 100)
# constants
V = 1
L = 1
# define the "model", ie the set of equations of t
def RHS_eq(t, y):
global condition
x0, x1 = y
# compute new results for x0 and x1
if condition:
x0_out, x1_out = V * np.sin(2 * np.pi * x1 / L), 0.0
else:
x0_out, x1_out = 0.0, V * np.sin(2 * np.pi * x1 / L)
# update condition
condition = 0 if condition==1 else 1
return [x0_out, x1_out]
solution = solve_ivp(RHS_eq, # Right Hand Side of the equation(s)
t_span, # time span, a 2-uple
initial, # initial conditions
)
fig, ax = plt.subplots()
ax.plot(solution.t,
solution.y[0],
label="x0")
ax.plot(solution.t,
solution.y[1],
label="x1")
ax.legend()
Final answer
Now, what we want is to do the exact same thing but for various initial conditions, and from what I understand, we can't : again, quoting the doc
y0 : array_like, shape (n,) : Initial state. . The solver's initial condition only allows one starting point vector.
So to answer the initial question : I don't think you can "integrate the differential equations starting for each point of the grid at once".
I'm trying to implement a multiclass logistic regression classifier that distinguishes between k different classes.
This is my code.
import numpy as np
from scipy.special import expit
def cost(X,y,theta,regTerm):
(m,n) = X.shape
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
return J
def gradient(X,y,theta,regTerm):
(m,n) = X.shape
grad = np.dot(((expit(np.dot(X,theta))).reshape(m,1) - y).T,X)/m + (np.concatenate(([0],theta[1:].T),axis=0)).reshape(1,n)
return np.asarray(grad)
def train(X,y,regTerm,learnRate,epsilon,k):
(m,n) = X.shape
theta = np.zeros((k,n))
for i in range(0,k):
previousCost = 0;
currentCost = cost(X,y,theta[i,:],regTerm)
while(np.abs(currentCost-previousCost) > epsilon):
print(theta[i,:])
theta[i,:] = theta[i,:] - learnRate*gradient(X,y,theta[i,:],regTerm)
print(theta[i,:])
previousCost = currentCost
currentCost = cost(X,y,theta[i,:],regTerm)
return theta
trX = np.load('trX.npy')
trY = np.load('trY.npy')
theta = train(trX,trY,2,0.1,0.1,4)
I can verify that cost and gradient are returning values that are in the right dimension (cost returns a scalar, and gradient returns a 1 by n row vector), but i get the error
RuntimeWarning: divide by zero encountered in log
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
why is this happening and how can i avoid this?
The proper solution here is to add some small epsilon to the argument of log function. What worked for me was
epsilon = 1e-5
def cost(X, y, theta):
m = X.shape[0]
yp = expit(X # theta)
cost = - np.average(y * np.log(yp + epsilon) + (1 - y) * np.log(1 - yp + epsilon))
return cost
You can clean up the formula by appropriately using broadcasting, the operator * for dot products of vectors, and the operator # for matrix multiplication — and breaking it up as suggested in the comments.
Here is your cost function:
def cost(X, y, theta, regTerm):
m = X.shape[0] # or y.shape, or even p.shape after the next line, number of training set
p = expit(X # theta)
log_loss = -np.average(y*np.log(p) + (1-y)*np.log(1-p))
J = log_loss + regTerm * np.linalg.norm(theta[1:]) / (2*m)
return J
You can clean up your gradient function along the same lines.
By the way, are you sure you want np.linalg.norm(theta[1:]). If you're trying to do L2-regularization, the term should be np.linalg.norm(theta[1:]) ** 2.
Cause:
This is happening because in some cases, whenever y[i] is equal to 1, the value of the Sigmoid function (theta) also becomes equal to 1.
Cost function:
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
Now, consider the following part in the above code snippet:
np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1)))
Here, you are performing (1 - theta) when the value of theta is 1. So, that will effectively become log (1 - 1) = log (0) which is undefined.
I'm guessing your data has negative values in it. You can't log a negative.
import numpy as np
np.log(2)
> 0.69314718055994529
np.log(-2)
> nan
There are a lot of different ways to transform your data that should help, if this is the case.
def cost(X, y, theta):
yp = expit(X # theta)
cost = - np.average(y * np.log(yp) + (1 - y) * np.log(1 - yp))
return cost
The warning originates from np.log(yp) when yp==0 and in np.log(1 - yp) when yp==1. One option is to filter out these values, and not to pass them into np.log. The other option is to add a small constant to prevent the value from being exactly 0 (as suggested in one of the comments above)
Add epsilon value[which is a miniature value] to the log value so that it won't be a problem at all.
But i am not sure if it will give accurate results or not .
I'd like to implement Euler's method (the explicit and the implicit one)
(https://en.wikipedia.org/wiki/Euler_method) for the following model:
x(t)' = q(x_M -x(t))x(t)
x(0) = x_0
where q, x_M and x_0 are real numbers.
I know already the (theoretical) implementation of the method. But I couldn't figure out where I can insert / change the model.
Could anybody help?
EDIT: You were right. I didn't understand correctly the method. Now, after a few hours, I think that I really got it! With the explicit method, I'm pretty sure (nevertheless: could anybody please have a look at my code? )
With the implicit implementation, I'm not very sure if it's correct. Could please anyone have a look at the implementation of the implicit method and give me a feedback what's correct / not good?
def explizit_euler():
''' x(t)' = q(xM -x(t))x(t)
x(0) = x0'''
q = 2.
xM = 2
x0 = 0.5
T = 5
dt = 0.01
N = T / dt
x = x0
t = 0.
for i in range (0 , int(N)):
t = t + dt
x = x + dt * (q * (xM - x) * x)
print '%6.3f %6.3f' % (t, x)
def implizit_euler():
''' x(t)' = q(xM -x(t))x(t)
x(0) = x0'''
q = 2.
xM = 2
x0 = 0.5
T = 5
dt = 0.01
N = T / dt
x = x0
t = 0.
for i in range (0 , int(N)):
t = t + dt
x = (1.0 / (1.0 - q *(xM + x) * x))
print '%6.3f %6.3f' % (t, x)
Pre-emptive note: Although the general idea should be correct, I did all the algebra in place in the editor box so there might be mistakes there. Please, check it yourself before using for anything really important.
I'm not sure how you come to the "implicit" formula
x = (1.0 / (1.0 - q *(xM + x) * x))
but this is wrong and you can check it by comparing your "explicit" and "implicit" results: they should slightly diverge but with this formula they will diverge drastically.
To understand the implicit Euler method, you should first get the idea behind the explicit one. And the idea is really simple and is explained at the Derivation section in the wiki: since derivative y'(x) is a limit of (y(x+h) - y(x))/h, you can approximate y(x+h) as y(x) + h*y'(x) for small h, assuming our original differential equation is
y'(x) = F(x, y(x))
Note that the reason this is only an approximation rather than exact value is that even over small range [x, x+h] the derivative y'(x) changes slightly. It means that if you want to get a better approximation of y(x+h), you need a better approximation of "average" derivative y'(x) over the range [x, x+h]. Let's call that approximation just y'. One idea of such improvement is to find both y' and y(x+h) at the same time by saying that we want to find such y' and y(x+h) that y' would be actually y'(x+h) (i.e. the derivative at the end). This results in the following system of equations:
y'(x+h) = F(x+h, y(x+h))
y(x+h) = y(x) + h*y'(x+h)
which is equivalent to a single "implicit" equation:
y(x+h) - y(x) = h * F(x+h, y(x+h))
It is called "implicit" because here the target y(x+h) is also a part of F. And note that quite similar equation is mentioned in the Modifications and extensions section of the wiki article.
So now going to your case that equation becomes
x(t+dt) - x(t) = dt*q*(xM -x(t+dt))*x(t+dt)
or equivalently
dt*q*x(t+dt)^2 + (1 - dt*q*xM)*x(t+dt) - x(t) = 0
This is a quadratic equation with two solutions:
x(t+dt) = [(dt*q*xM - 1) ± sqrt((dt*q*xM - 1)^2 + 4*dt*q*x(t))]/(2*dt*q)
Obviously we want the solution that is "close" to the x(t) which is the + solution. So the code should be something like:
b = (q * xM * dt - 1)
x(t+h) = (b + (b ** 2 + 4 * q * x(t) * dt) ** 0.5) / 2 / q / dt
(editor note:) Applying the binomial complement, this formula has the numerically more stable form for small dt, where then b < 0,
x(t+h) = (2 * x(t)) / ((b ** 2 + 4 * q * x(t) * dt) ** 0.5 - b)