I'm having some difficulty implementing a negative log-likelihood function in Python.
My negative log-likelihood function is given as:
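Presumably this is the Poisson negative log-likelihood, judging from the implementation below:

J(\theta) = \sum_{i=1}^{M} \left[ -y_i x_i \theta + \exp(x_i \theta) + \log(y_i!) \right]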
This is my implementation, but I keep getting an error: ValueError: shapes (31,1) and (2458,1) not aligned: 1 (dim 1) != 2458 (dim 0)
def negative_loglikelihood(X, y, theta):
    J = np.sum(-y @ X @ theta) + np.sum(np.exp(X @ theta)) + np.sum(np.log(y))
    return J
X is a dataframe of size (2458, 31), y is a dataframe of size (2458, 1), and theta is a dataframe of size (31, 1).
I cannot figure out what I am missing. Is my implementation incorrect somehow? Any help would be much appreciated, thanks.
You cannot use matrix multiplication here; what you want is to multiply elements with the same index together, i.e. element-wise multiplication. The correct operator for this is *.
Moreover, you must transpose theta so NumPy can broadcast the dimension with size 1 to 2458 (and likewise for y: its 1 is broadcast to 31).
x = np.random.rand(2458, 31)
y = np.random.rand(2458, 1)
theta = np.random.rand(31, 1)
def negative_loglikelihood(x, y, theta):
    J = np.sum(-y * x * theta.T) + np.sum(np.exp(x * theta.T)) + np.sum(np.log(y))
    return J
negative_loglikelihood(x, y, theta)
>>> 88707.699
EDIT: your formula has y! inside the logarithm; you should also update your code to match.
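For example, a minimal way to include that term (a sketch, assuming SciPy is available; gammaln(y + 1) equals log(y!) for non-negative y):

from scipy.special import gammaln  # gammaln(y + 1) == log(y!) element-wise

def negative_loglikelihood(x, y, theta):
    # same element-wise form as above, with log(y) replaced by log(y!)
    J = np.sum(-y * x * theta.T) + np.sum(np.exp(x * theta.T)) + np.sum(gammaln(y + 1))
    return J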
If you look at your equation, the term y_i x_i θ is summed over i = 1 to M, so the same index i must run over both y and x; otherwise apply a separate function to each.
I am currently trying to work through Assignment 1 of the Michigan Deep Learning for Computer Vision course (EECS 498-007 / 598-005) by myself; it seems to have a rough equivalent in Stanford CS 231n.
Problem-formulation:
Create a function which computes the pairwise Euclidean distance. Inputs: xtrain, xtest, with dimensions [N, x, x] and [M, x, x] (x being the same number in both).
Output: a distance matrix of shape [N, M] expressing the distance between each training point and each testing point.
There is given a hint in the assignment:
Try to formulate the Euclidean distance using two broadcast sums and a matrix multiply.
I am trying to implement this mathematical operation using broadcasting, i.e. the expansion ||a - b||^2 = ||a||^2 - 2 a·b + ||b||^2, where the middle term is a simple matrix multiplication.
I am struggling with the tensor-shapes. My implementation so far is as follows:
def euc_no_loop(x, y):
    # hint: two broadcast sums
    xsq = torch.sum(x**2, axis=1)
    print(xsq.shape)
    ysq = torch.sum(y**2, axis=1)
    print(ysq.shape)
    # and one matrix multiply
    mixprod = -2 * x.view(x.shape[0], -1).matmul(y.view(y.shape[0], -1).T)
    print(mixprod.shape)
    euc_dist = torch.sqrt(xsq + mixprod + ysq.unsqueeze(1).T)
    return euc_dist
With inputs being:
x = torch.randn(5,3,3)
y = torch.randn(3,3,3)
shapes become:
xsq: [5,3]
ysq: [3,3]
mixprod: [5,3]
And output dimension becomes [3,5,3].
Many other Stack Overflow threads exist where NumPy is used, but the NumPy dot product seems to be more flexible than torch.matmul.
Example on numpy-solution: Compute L2 distance with numpy using matrix multiplication
I simply don't understand where I am going wrong.
The input tensors should probably have two dimensions in order to compute a pairwise distance, so I assume those x-by-x matrices should be summed over entirely, i.e. (N, x, x) => (N,).
def euc_no_loop(x, y):
    # Suppose x has (N, x, x) and y has (M, x, x) dimensions
    xsq = torch.sum(x**2, dim=(1, 2))  # (N,)
    print(xsq.shape)
    ysq = torch.sum(y**2, dim=(1, 2))  # (M,)
    print(ysq.shape)
    mixprod = -2 * x.view(x.shape[0], -1) @ y.view(y.shape[0], -1).T  # (N, M)
    print(mixprod.shape)
    euc_dist = torch.sqrt(xsq.unsqueeze(1) + mixprod + ysq.unsqueeze(0))  # (N,1)+(N,M)+(1,M) => (N, M)
    return euc_dist
Or just flatten the input tensors
x = x.flatten(start_dim=1)
y = y.flatten(start_dim=1)
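As a quick sanity check (a sketch, assuming the euc_no_loop defined above), torch.cdist applied to the flattened inputs should produce the same matrix:

import torch

x = torch.randn(5, 3, 3)
y = torch.randn(3, 3, 3)

d = euc_no_loop(x, y)                                               # (5, 3)
ref = torch.cdist(x.flatten(start_dim=1), y.flatten(start_dim=1))  # (5, 3)
print(d.shape, torch.allclose(d, ref, atol=1e-5))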
I'm trying to reshape an array with some values from a .csv file and it's giving me an error on multiple lines. I have tried to find some examples on Stack Overflow but I wasn't able to figure out what the actual problem is here. I'm getting this error every time, whenever I try to use np.zeros(), np.ones(), and np.array().
I have exam data that consists of EXAM-1, EXAM-2 and an admission decision. My X, y, and theta don't seem to be the same size.
def sigmoid(z):
    new_val = 1 / (1 + np.exp(-z))
    return new_val

def h(theta, X):
    return sigmoid(np.dot(X, theta))  # ------Value Error

def compute_logistic_cost(theta, X, y):
    m = len(y)
    J = (1/m) * np.sum((-y * np.log(h(theta, X))) - ((1 - y) * np.log(1 - h(theta, X))))  # -----Value Error
    eps = 1e-12
    hypothesis[hypothesis < eps] = eps
    eps = 1.0 - 1e-12
    hypothesis[hypothesis > eps] = eps
    return J

X = np.ones((3, 1))  # ------Value Error; if I put 100 instead of 1 it works
X[1:, :] = X.T
theta = np.zeros((3, 1))
print(compute_logistic_cost(theta, X, y))  # ------Value Error

theta = np.array([[1.0],
                  [1.0],
                  [1.0]])
print(compute_logistic_cost(theta, X, y))

theta = np.array([[0.1],
                  [0.1],
                  [0.1]])
print(compute_logistic_cost(theta, X, y))
Following is the error message; please help me understand it. ValueError: shapes (3,100) and (3,1) not aligned: 100 (dim 1) != 3 (dim 0)
It looks like this is a mathematical failure: taking the dot product of the two matrices fails because their shapes are incompatible. See here for the doc describing this ValueError. Specifically:
Raises:
ValueError
If the last dimension of a is not the same size as the second-to-last dimension of b.
Matrix math is hard. It looks like you just need the transpose of the smaller matrix for the dot product to execute successfully. I do not know enough to tell you whether the rest of the script is correct, but this will at least clear the error so you can continue.
Hope that helps.
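A minimal sketch of the mismatch, with the shapes taken from the error message, showing how transposing the smaller matrix lets the dot product go through:

import numpy as np

X = np.ones((3, 100))     # (3, 100)
theta = np.zeros((3, 1))  # (3, 1)

# np.dot(X, theta) fails: the last dim of X (100) != the first dim of theta (3)
h = np.dot(theta.T, X)    # (1, 3) @ (3, 100) -> (1, 100), shapes now align
print(h.shape)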
I am trying to find the MSE for a given Phi, with output y and calculated weights w. While implementing (y - w.T * Phi), I get a ValueError in the w.T * Phi part. I know this is a dimension error, but I've tried to change it and it's not working for me.
I've tried transpose (but it's not really transposing, it just stays as it is) and reshape.
X=[1,2,3]
d=3
Phi=np.polynomial.polynomial.polyvander(X,d)
y=[2,3,4]
def train_model(Phi, y):
    pht = np.matrix.transpose(Phi)
    u = np.matmul(pht, Phi)
    q = np.linalg.inv(u)
    s = np.matmul(q, pht)
    w = np.matmul(s, y)
    return w
w=train_model(Phi,y)
def evaluate_model(Phi, y, w):
    sum = 0
    wt = np.matrix.transpose(w)
    for i in range(0, len(y)):
        g = np.matmul(wt, Phi[:, i])
        k = y[i] - g
        l = k ** 2
        sum += l
    avg = sum / len(y)
    return avg
Edit:
The error I get is
ValueError: shapes (4,) and (53,) not aligned: 4 (dim 0) != 53 (dim 0)
It looks like your indexing is wrong; polyvander returns one row per sample (shape (len(X), d+1)), so w should be matched against rows of Phi rather than columns. Try
g = np.matmul(wt,Phi[i,:])
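A quick illustration of why (a sketch using the shapes from the snippet above, with a dummy w):

import numpy as np

X = [1, 2, 3]
d = 3
Phi = np.polynomial.polynomial.polyvander(X, d)  # shape (3, 4): one row per sample
w = np.ones(d + 1)                               # shape (4,): one weight per power

print(Phi[:, 0].shape)          # (3,)  -> does not align with w's (4,)
print(np.matmul(w, Phi[0, :]))  # (4,) dot (4,) -> scalar prediction, as intended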
I'm trying to create a matrix with values based on x, y values I have stored in tuples. I use a loop to iterate over the tuples and perform a simple calculation on the data:
import numpy as np

# Trying to fit quadratic equation to the measured dots
N = 6
num_of_params = 3
# x values
x = (1, 4, 3, 5, 2, 6)
# y values
y = (3.96, 24.96, 14.15, 39.8, 7.07, 59.4)
# X is a matrix N * 3 with the x values to the power of {0,1,2}
X = np.zeros((N, 3))
Y = np.zeros((N, 1))

print X, "\n\n", Y

for i in range(len(x)):
    for p in range(num_of_params):
        X[i][p] = x[i]**(num_of_params - p - 1)
    Y[i] = y[i]

print "\n\n"
print X, "\n\n", Y
Can this be achieved in an easier way? I'm looking for some way to initialize the matrix, like X = np.zeros((N,3), read_values_from=x).
Is it possible? Is there another simple way?
Python 2.7
Extend the array version of x to 2D with a singleton dim (a dim of length 1) along the second axis using np.newaxis/None. This lets us leverage NumPy broadcasting to get the 2D output in a vectorized manner. The same philosophy applies to y.
Hence, the implementation would be -
X = np.asarray(x)[:,None]**(num_of_params - np.arange(num_of_params) - 1)
Y = np.asarray(y)[:,None]
Or use the built-in outer method of np.power to get X, which takes care of the array conversion under the hood -
X = np.power.outer(x, num_of_params - np.arange(num_of_params) - 1)
Alternatively, for Y, use np.expand_dims -
Y = np.expand_dims(y,1)
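As a quick sanity check (a sketch reusing the x values from the question), both vectorized forms reproduce the loop-built X:

import numpy as np

x = (1, 4, 3, 5, 2, 6)
num_of_params = 3

# loop version from the question
X_loop = np.zeros((len(x), num_of_params))
for i in range(len(x)):
    for p in range(num_of_params):
        X_loop[i][p] = x[i]**(num_of_params - p - 1)

# broadcasting and outer-product versions from above
X_vec = np.asarray(x)[:, None]**(num_of_params - np.arange(num_of_params) - 1)
X_outer = np.power.outer(x, num_of_params - np.arange(num_of_params) - 1)

print(np.allclose(X_loop, X_vec) and np.allclose(X_loop, X_outer))  # True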
I have been trying to use fmin_cg to minimize the cost function for logistic regression.
xopt = fmin_cg(costFn, fprime=grad, x0=initial_theta,
               args=(X, y, m), maxiter=400, disp=True, full_output=True)
This is how I call my fmin_cg
Here is my CostFn:
def costFn(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 0
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1 - y) * np.log(1 - h)))
    return J.flatten()
Here is my grad:
def grad(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1 - y) * np.log(1 - h)))
    gg = 1 / m * (X.T.dot(h - y))
    return gg.flatten()
It seems to be throwing this error:
/Users/sugethakch/miniconda2/lib/python2.7/site-packages/scipy/optimize/linesearch.pyc in phi(s)
85 def phi(s):
86 fc[0] += 1
---> 87 return f(xk + s*pk, *args)
88
89 def derphi(s):
ValueError: operands could not be broadcast together with shapes (3,) (300,)
I know it's something to do with my dimensions. But I can't seem to figure it out.
I am a noob, so I might be making an obvious mistake.
I have read this link:
fmin_cg: Desired error not necessarily achieved due to precision loss
But, it somehow doesn't seem to work for me.
Any help?
Updated sizes for X, y, m, theta:
(100, 3) ----> X
(100, 1) -----> y
100 ----> m
(3, 1) ----> theta
This is how I initialize X,y,m:
data = pd.read_csv('ex2data1.txt', sep=",", header=None)
data.columns = ['x1', 'x2', 'y']
x1 = data.iloc[:, 0].values[:, None]
x2 = data.iloc[:, 1].values[:, None]
y = data.iloc[:, 2].values[:, None]
# join x1 and x2 to make one array of X
X = np.concatenate((x1, x2), axis=1)
m, n = X.shape
ex2data1.txt:
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
.....
If it helps, I am trying to re-code one of the homework assignments for Coursera's ML course by Andrew Ng in Python.
Finally, I figured out what the problem in my initial program was.
My 'y' was (100, 1) and fmin_cg expects (100,). Once I flattened my 'y', it no longer threw the initial error. But the optimization still wasn't working.
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.693147
Iterations: 0
Function evaluations: 43
Gradient evaluations: 41
This was the same as what I achieved without optimization.
I figured out the way to optimize this was to use the 'Nelder-Mead' method. I followed this answer: scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"
Result = op.minimize(fun=costFn,
                     x0=initial_theta,
                     args=(X, y, m),
                     method='Nelder-Mead',
                     options={'disp': True})  # ,
                     # jac = grad)
This method doesn't need a Jacobian.
I got the results I was looking for:
Optimization terminated successfully.
Current function value: 0.203498
Iterations: 157
Function evaluations: 287
Well, since I don't know exactly how you're initializing m, X, y, and theta, I had to make some assumptions. Hopefully my answer is relevant:
import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit

def costFn(theta, X, y, m):
    # expit is the same as sigmoid, but faster
    h = expit(X.dot(theta))
    # instead of 1/m, I take the mean
    J = np.mean((-(y * np.log(h))) - ((1 - y) * np.log(1 - h)))
    return J  # should be a scalar

def grad(theta, X, y, m):
    h = expit(X.dot(theta))
    J = np.mean((-(y * np.log(h))) - ((1 - y) * np.log(1 - h)))
    gg = (X.T.dot(h - y))
    return gg.flatten()

# initialize matrices
X = np.random.randn(100, 3)
y = np.random.randn(100,)  # this apparently needs to be a 1-d vector
m = np.ones((3,))  # not using m, used np.mean for a weighted sum (see ali_m's comment)
theta = np.ones((3, 1))

xopt = fmin_cg(costFn, fprime=grad, x0=theta, args=(X, y, m), maxiter=400, disp=True, full_output=True)
While the code runs, I don't know enough about your problem to know if this is what you're looking for. But hopefully this can help you understand the problem better. One way to check your answer is to call fmin_cg with fprime=None and see how the answers compare.
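For instance, a sketch of that check using the same variables as above; with fprime=None, fmin_cg falls back to a finite-difference gradient:

# let fmin_cg approximate the gradient numerically and compare the two results
xopt_numeric = fmin_cg(costFn, x0=theta, fprime=None, args=(X, y, m),
                       maxiter=400, disp=True, full_output=True)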