Look at this code:
import theano
import theano.tensor as T
import numpy as np

x = T.dvector('x')
y = T.dvector('y')

def fun(x, a):
    return x + a

results, updates = theano.scan(fn=fun,
                               sequences=dict(input=x),
                               outputs_info=dict(initial=y, taps=[-3]))

h = [10., 20, 30, 40, 50, 60, 70]
f = theano.function([x, y], results)
g = theano.function([y], y)
print(f([1], h))
I have changed outputs_info's taps to -2, -3, and so on, but the result of the code is always the same, [11.0]. I can't understand why. Can somebody explain it?
Another question.
import theano
import theano.tensor as T
import numpy as np

x = T.dvector('x')
y = T.dvector('y')

def fun(x, a, b):
    return x + a + b

results, updates = theano.scan(fn=fun,
                               sequences=dict(input=x),
                               outputs_info=dict(initial=y, taps=[-5, -3]))

h = [10., 20, 30, 40, 50, 60, 70]
f = theano.function([x, y], results)
g = theano.function([y], y)
print(f([1, 2, 3, 4], h))
The output is [41, 62, 83, 85]. Where does the 85 come from?
Consider this variation on your code:
x = T.dvector('x')
y = T.dvector('y')

def fun(x, a, b):
    return x + b

results, updates = theano.scan(
    fn=fun,
    sequences=dict(input=x),
    outputs_info=dict(initial=y, taps=[-5, -3])
)

h = [10., 20, 30, 40, 50, 60, 70]
f = theano.function([x, y], results)
g = theano.function([y], y)
print(f([1], h))
Your result will be 31.
Change taps to [-5, -2] and your result changes to 41.
Change taps to [-4, -3] and your result changes to 21.
This demonstrates how things are working:
- The largest negative number in taps is lined up with h[0].
- All other taps are offsets from that one.
So when taps is [-5, -2], fun receives a = h[0] = 10 and b = h[3] = 40 (the -2 tap sits 3 positions after the -5 tap), as the sketch below shows.
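Here is that mapping spelled out in plain Python (no Theano involved; the offset arithmetic is just the rule above made explicit):

taps = [-5, -2]
h = [10., 20., 30., 40., 50., 60., 70.]
offset = -min(taps)   # 5: the earliest tap lines up with h[0]
for tap in taps:
    print(tap, '->', h[tap + offset])
# -5 -> 10.0
# -2 -> 40.0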
Update for new question
taps indicates that the output at time t depends on the function's earlier outputs: a tap of -n means the output at time t - n is fed back in.
For instance, the Fibonacci sequence is defined by the function f(t) = f(t-1) + f(t-2), i.e. taps of -1 and -2.
Here's how you'd implement the Fibonacci sequence with theano.scan:
x = T.ivector('x')
y = T.ivector('y')

def fibonacci(x, a, b):
    # x only sets the number of steps; a and b are the outputs
    # from two steps back and one step back respectively
    return a + b

results, _ = theano.scan(
    fn=fibonacci,
    sequences=dict(input=x),
    outputs_info=dict(initial=y, taps=[-2, -1])
)

h = [1, 1]
f = theano.function([x, y], results)
print(np.append(h, f(range(10), h)))  # [1 1 2 3 5 8 13 21 34 55 89 144]
However, theano.scan has a bootstrapping problem: if the function depends on prior output, what do you use as the prior output for the very first iteration?
The answer is the initial input, h in your case. But your h is longer than it needs to be: only 5 elements are required, because the earliest tap is -5. After consuming those 5 elements of h, your function switches over to its own actual outputs.
Here's a simplified trace of what's happening in your code:
output[0] = x[0] + h[0] + h[2] = 41
output[1] = x[1] + h[1] + h[3] = 62
output[2] = x[2] + h[2] + h[4] = 83
output[3] = x[3] + h[3] + output[0] = 85
You'll see that at step 3 the -3 tap reaches back to step 0, whose output is 41. Since that output now exists, scan must use it, because the function is defined in terms of its prior outputs. The remaining elements of h are simply ignored.
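Here is a minimal plain-Python sketch of that bookkeeping (just the indexing rule spelled out, not Theano's actual implementation); it reproduces [41, 62, 83, 85]:

x = [1, 2, 3, 4]
h = [10., 20., 30., 40., 50., 60., 70.]
taps = [-5, -3]
depth = -min(taps)        # 5: how much history scan keeps
buf = list(h[:depth])     # only the first 5 elements of h are ever used
outputs = []
for t, xt in enumerate(x):
    a = buf[t + taps[0] + depth]  # the -5 tap
    b = buf[t + taps[1] + depth]  # the -3 tap
    out = xt + a + b
    buf.append(out)       # new outputs extend the history
    outputs.append(out)
print(outputs)            # [41.0, 62.0, 83.0, 85.0]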
Related
I am new to Python and have no idea how to find the area under a curve given a function. If my function is, for example, 3x^2 + 2x + 11, how can I go about doing this?
I would like to do this using an approximation.
You could use SymPy to integrate for you, then you just need to plug in the endpoints. For example:
from sympy import Poly
from sympy.abc import x
f = Poly(3*x**2 + 2*x + 11) # Or `Poly((3, 2, 11), x)`
g = f.integrate()
# > Poly(x**3 + x**2 + 11*x, x, domain='QQ')
start, end = -1, 1
result = g(end) - g(start)
# > 24
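If you don't need the Poly object, sympy can also do the definite integral in one call, with the endpoints plugged in for you:

from sympy import integrate
from sympy.abc import x

result = integrate(3*x**2 + 2*x + 11, (x, -1, 1))
# > 24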
I just built this, which does approximations. The integrate function:
- takes a function as its first argument
- requires upper and lower bounds
- works out the area of each slice (averaging the two edge heights, i.e. trapezoids)
- adds them together
def integrate(f, a: float, b: float) -> float:
    '''Approximate the integral of f between a and b with the trapezoid rule.'''
    area = 0.0
    x = a
    parts = 100000
    dx = (b - a) / parts          # width of each slice
    for i in range(parts):
        y_0 = f(x)                # height at the left edge of the slice
        y_1 = f(x + dx)           # height at the right edge
        x = x + dx
        height = (y_1 + y_0) / 2  # average the two edges: a trapezoid
        area = area + height * dx
    return area

def f(x): return 3*x**3 + x**2 + 11

r = integrate(f, 0, 1)
print(r)
result for the given example:
12.08333333342187
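As a cross-check (assuming scipy is available to you), scipy.integrate.quad gives essentially the same value, along with an error estimate:

from scipy.integrate import quad

def f(x): return 3*x**3 + x**2 + 11

value, abserr = quad(f, 0, 1)
print(value)  # 12.083333333333334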
Given this Matlab code, created by my teacher:
function [] = explicitWave(T,L,N,J)
% Explicit method for the wave eq.
% T: Length time-interval
% L: Length x-interval
% N: Number of time-intervals
% J: Number of x-intervals
    k = T/N;
    h = L/J;
    r = (k*k)/(h*h);
    k/h
    x = linspace(0,L,J+1);       % number of points = number of intervals + 1
    uOldOld = f(x);              % solution two time-steps backwards. Initial condition
    disp(uOldOld)
    uOld = zeros(1,length(x));   % solution at previous time-step
    uNext = zeros(1,length(x));

    % First time-step
    for j = 2:J
        uOld(j) = (1-r)*f(x(j)) + r/2*(f(x(j+1)) + f(x(j-1))) + k*g(x(j));
    end

    % Remaining time-steps
    for n = 0:N-1
        for j = 2:J
            uNext(j) = 2*(1-r)*uOld(j) + r*(uOld(j+1) + uOld(j-1)) - uOldOld(j);
        end
        uOldOld = uOld;
        uOld = uNext;
    end

    plot(x,uNext,'r')
end
I tried to implement this in Python by using this code:
import numpy as np
import matplotlib.pyplot as plt

def explicit_wave(f, g, T, L, N, J):
    """
    :param f: Initial displacement u(x, 0)
    :param g: Initial velocity u_t(x, 0)
    :param T: Length of time interval
    :param L: Length of x-interval
    :param N: Number of time intervals
    :param J: Number of x-intervals
    :return: Final solution and grid
    """
    k = T/N
    h = L/J
    r = (k**2) / (h**2)

    x = np.linspace(0, L, J+1)
    Uoldold = f(x)
    Uold = np.zeros(len(x))
    Unext = np.zeros(len(x))

    # First time-step
    for j in range(1, J):
        Uold[j] = (1-r)*f(x[j]) + (r/2)*(f(x[j+1]) + f(x[j-1])) + k*g(x[j])

    # Remaining time-steps
    for n in range(N-1):
        for j in range(1, J):
            Unext[j] = 2*(1-r)*Uold[j] + r*(Uold[j+1] + Uold[j-1]) - Uoldold[j]
        Uoldold = Uold
        Uold = Unext

    plt.plot(x, Unext)
    plt.show()
    return Unext, x
However, when I run the code with the same inputs, I get different results when plotting them. My inputs:
g = lambda x: -np.sin(2*np.pi*x)
f = lambda x: 2*np.sin(np.pi*x)
T = 8.0
L = 1.0
J = 60
N = 480
Python plot result compared to the exact result (x markers: exact solution, red line: computed function). [plot omitted]
Matlab plot result (x markers: exact solution, red line: computed function). [plot omitted]
Could you see any obvious errors I might have made when translating this code?
In case anyone needs the exact solution:
exact = lambda x,t: 2*np.sin(np.pi*x)*np.cos(np.pi*t) - (1/(2*np.pi))*np.sin(2*np.pi*x)*np.sin(2*np.pi*t)
I found the error through debugging. The main problem here is the code:
Uoldold = Uold
Uold = Unext
So in Python, assigning one variable to another does not make a copy; both names end up referring to the same underlying object, so a change made through one name shows up through the other. Let me illustrate with an example using lists:
a = [1,2,3,4]
b = a
b[1] = 10
print(a)
>> [1, 10, 3, 4]
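The same aliasing happens with the NumPy arrays in the wave code:

import numpy as np

a = np.zeros(4)
b = a          # b is the same array object, not a copy
b[1] = 10
print(a)       # [ 0. 10.  0.  0.]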
So the solution here was to use .copy()
Resulting in this:
Uoldold = Uold.copy()
Uold = Unext.copy()
I am new to programming in general, but for a project I am trying to randomly choose outcomes, according to the probability of each outcome, for lotteries that I have generated, and I would like to use a loop to draw random numbers each time.
This is my code:
import numpy as np

p = np.arange(0.01, 1, 0.001, dtype=float)

alpha = 0.5
alpha = float(alpha)
alpha = np.zeros((1, len(p))) + alpha

def w(alpha, p):
    return np.exp(-(-np.log(p))**alpha)

w = w(alpha, p)

def P(w):
    return np.exp(np.log2(w))

prob_win = P(w)
prob_lose = 1 - prob_win

E = 10
E = float(E)
E = np.zeros((1, len(p))) + E

b = 0
b = float(b)
b = np.zeros((1, len(p))) + b

def A(E, b, prob_win):
    return (E - b * (1 - prob_win)) / prob_win

a = A(E, b, prob_win)
a = a.squeeze()

prob_array = (prob_win, prob_lose)
prob_matrix = np.vstack(prob_array).T.squeeze()
outcomes_array = (a, b)
outcomes_matrix = np.vstack(outcomes_array).T
outcome_pairs = np.vsplit(outcomes_matrix, len(p))
outcome_pairs = np.array(outcome_pairs).astype(float)
prob_pairs = np.vsplit(prob_matrix, len(p))
prob_pairs = np.array(prob_pairs)
# note: the loop variable below is a probability pair, despite its name
nominalized_prob_pairs = [outcome_pairs / np.sum(outcome_pairs)
                          for outcome_pairs in np.vsplit(prob_pairs, len(p))]
The code works fine, but for the next lines I would like to use a loop (or something similar), because I want 5 realizations for each row/pair of probabilities. When I use size=5 I just get one really long list and I can no longer tell which values belong to which pair, unlike with size=1:
realisations = np.concatenate([np.random.choice(outcome_pairs[i].ravel(),
                                                size=1, p=nominalized_prob_pairs[i].ravel())
                               for i in range(len(outcome_pairs))])
Or, if I use size=5 as below, how can I match the realizations to the initial probabilities? Do I need to cut the array after every 5th element and then store the values in a matrix with 5 columns and a new row for every 5th element of the initial array? If yes, how could I do this?
realisations = np.concatenate([np.random.choice(outcome_pairs[i].ravel(),
                                                size=5, p=nominalized_prob_pairs[i].ravel())
                               for i in range(len(outcome_pairs))])
What are you trying to produce, exactly? Be more concise.
Here is some clean starter code with which you can produce linear data:
import numpy as np

def generate_data(n_samples, variance):
    # generate 2D data
    X = np.random.random((n_samples, 1))
    # add a column of ones to ease the calculus
    X = np.concatenate((np.ones((n_samples, 1)), X), axis=1)
    # generate two random coefficients
    W = np.random.random((2, 1))
    # construct targets from our data and weights
    y = X @ W
    # add some noise to our data
    y += np.random.normal(0, variance, (n_samples, 1))
    return X, y, W

if __name__ == "__main__":
    X, Y, W = generate_data(10, 0.5)
    # check a random value of x, for example
    for x in X:
        print(x, end=' --> ')
        if x[1] <= 0.4:
            print('prob <= 0.4')
        else:
            print('prob > 0.4')
I am trying to implement a non-parametric estimation of the KL divergence shown in this paper.
Here is my code:
import numpy as np
import math
from scipy.interpolate import interp1d

def log(x):
    # guard against log of zero or negative ratios
    if x > 0:
        return math.log(x)
    else:
        return 0

# empirical CDF of the sample inp, evaluated at x
g = lambda x, inp, N: sum(0.5 + 0.5 * np.sign(x - inp)) / N

def ecdf(x, N):
    out = [g(i, x, N) for i in x]
    fun = interp1d(x, out, kind='linear', bounds_error=False, fill_value=(0, 1))
    return fun

def KL_est(x, y):
    ex = min(np.diff(sorted(np.unique(x))))
    ey = min(np.diff(sorted(np.unique(y))))
    e = min(ex, ey) * 0.9
    N = len(x)
    x.sort()
    y.sort()
    P = ecdf(x, N)
    Q = ecdf(y, N)
    KL = sum(log(v) for v in ((P(x) - P(x-e)) / (Q(x) - Q(x-e)))) / N
    return KL
My trouble is with scipy's interp1d. I use the function returned by interp1d to evaluate new inputs. The problem is that some of the input values are very close together (about 10^-5 apart), and the interpolant returns the same value for both. In my code above, Q(x) - Q(x-e) then leads to a divide-by-zero error.
Here is some test code that reproduces the problem:
x = np.random.normal(0, 1, 10)
y = np.random.normal(0, 1, 10)
ex = min(np.diff(sorted(np.unique(x))))
ey = min(np.diff(sorted(np.unique(y))))
e = min(ex,ey) * 0.9
N = len(x)
x.sort()
y.sort()
P = ecdf(x,N)
Q = ecdf(y,N)
KL = sum(log(v) for v in ((P(x)-P(x-e))/(Q(x)-Q(x-e))) ) / N
How would I go about getting a more accurate interpolation?
As e gets small you are effectively trying to compute the ratio of derivatives of P and Q numerically. As you are finding, you run out of precision really quickly in floating point doing it this way.
An alternate approach would be to use an interpolation object that can return derivatives directly, for example scipy.interpolate.InterpolatedUnivariateSpline. You were passing kind='linear' to interp1d, so the equivalent spline degree is k=1. Once constructed, the spline has a derivatives() method that gives you all of its derivatives at a given point (and a derivative() method that returns the derivative as a new spline). For small values of e you could switch to using the derivative.
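Here's a minimal sketch of that idea. The ECDF construction below is my stand-in for your ecdf, and I use k=3 rather than k=1 so the derivative is smooth (with k=1 the derivative is just the piecewise-constant slope of each segment):

import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.sort(np.random.normal(0, 1, 10))
N = len(x)
ecdf_heights = np.arange(1, N + 1) / N  # empirical CDF at the sample points
P = InterpolatedUnivariateSpline(x, ecdf_heights, k=3)
dP = P.derivative()                     # a new spline representing dP/dx
print(dP(x))                            # estimated density at the sample points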
I have been trying to use fmin_cg to minimize the cost function for logistic regression. This is how I call fmin_cg:

xopt = fmin_cg(costFn, fprime=grad, x0=initial_theta,
               args=(X, y, m), maxiter=400, disp=True, full_output=True)
Here is my CostFn:
def costFn(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 0
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J.flatten()
Here is my grad:
def grad(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg = 1 / m * (X.T.dot(h-y))
    return gg.flatten()
It seems to be throwing this error:
/Users/sugethakch/miniconda2/lib/python2.7/site-packages/scipy/optimize/linesearch.pyc in phi(s)
85 def phi(s):
86 fc[0] += 1
---> 87 return f(xk + s*pk, *args)
88
89 def derphi(s):
ValueError: operands could not be broadcast together with shapes (3,) (300,)
I know it's something to do with my dimensions, but I can't seem to figure it out.
I am a noob, so I might be making an obvious mistake.
I have read this link:
fmin_cg: Desired error not necessarily achieved due to precision loss
But, it somehow doesn't seem to work for me.
Any help?
Updated sizes for X, y, m, theta:

(100, 3) ----> X
(100, 1) ----> y
100      ----> m
(3, 1)   ----> theta
This is how I initialize X,y,m:
data = pd.read_csv('ex2data1.txt', sep=",", header=None)
data.columns = ['x1', 'x2', 'y']
x1 = data.iloc[:, 0].values[:, None]
x2 = data.iloc[:, 1].values[:, None]
y = data.iloc[:, 2].values[:, None]
# join x1 and x2 to make one array of X
X = np.concatenate((x1, x2), axis=1)
m, n = X.shape
ex2data1.txt:
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
.....
If it helps, I am trying to re-code one of the homework assignments from Coursera's ML course by Andrew Ng in Python.
Finally, I figured out what the problem in my initial program was.
My 'y' was (100, 1) and fmin_cg expects (100,). Once I flattened my 'y' it no longer threw the initial error. But the optimization still wasn't working:
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.693147
Iterations: 0
Function evaluations: 43
Gradient evaluations: 41
This was the same as what I achieved without optimization.
I figured out the way to optimize this was to use the 'Nelder-Mead' method. I followed this answer: scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"
Result = op.minimize(fun=costFn,
                     x0=initial_theta,
                     args=(X, y, m),
                     method='Nelder-Mead',
                     options={'disp': True})  # jac=grad omitted; Nelder-Mead doesn't use it
This method doesn't need a Jacobian (gradient) function.
I got the results I was looking for:
Optimization terminated successfully.
Current function value: 0.203498
Iterations: 157
Function evaluations: 287
Well, since I don't know exactly how you're initializing m, X, y, and theta, I had to make some assumptions. Hopefully my answer is relevant:
import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit

def costFn(theta, X, y, m):
    # expit is the same as sigmoid, but faster
    h = expit(X.dot(theta))
    # instead of 1/m, take the mean
    J = np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J  # should be a scalar

def grad(theta, X, y, m):
    h = expit(X.dot(theta))
    # divide by len(y) so the gradient matches the mean used in costFn
    gg = X.T.dot(h - y) / len(y)
    return gg.flatten()

# initialize matrices
X = np.random.randn(100, 3)
y = np.random.randn(100,)  # this apparently needs to be a 1-d vector
m = np.ones((3,))  # m is unused; np.mean handles the 1/m scaling (see ali_m's comment)
theta = np.ones((3, 1))

xopt = fmin_cg(costFn, fprime=grad, x0=theta, args=(X, y, m),
               maxiter=400, disp=True, full_output=True)
While the code runs, I don't know enough about your problem to know if this is what you're looking for. But hopefully this can help you understand the problem better. One way to check your answer is to call fmin_cg with fprime=None and see how the answers compare.
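Another quick sanity check, assuming the definitions above: scipy.optimize.check_grad compares grad against a finite-difference approximation of costFn, and the returned number should be close to zero if the two are consistent:

from scipy.optimize import check_grad

err = check_grad(costFn, grad, np.ones(3), X, y, m)
print(err)  # should be small, e.g. on the order of 1e-6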