Optimal combination of linked-buckets - python

Let's say I have the following (always binary) options:
import numpy as np
a=np.array([1, 1, 0, 0, 1, 1, 1])
b=np.array([1, 1, 0, 0, 1, 0, 1])
c=np.array([1, 0, 0, 1, 0, 0, 0])
d=np.array([1, 0, 1, 1, 0, 0, 0])
And I want to find the combination of the above that gets me to at least the following requirement, with minimal overshoot above it:
req = np.array([50,50,20,20,100,40,10])
For example:
final = X1*a + X2*b + X3*c + X4*d
Does this map to a known operational research problem? Or does it fall under mathematical programming?
Is this NP-hard, or exactly solvable in a reasonable amount of time? (I've assumed it's combinatorially impossible to solve exactly.)
Are there known solutions to this?
Note: the actual arrays are longer - think ~50 elements - and the number of options is ~20.
My current research has led me to some combination of the assignment problem and knapsack, but I'm not too sure.

It's a covering problem, easily solvable using an integer programming solver (I used OR-Tools below). If the X variables can be fractional, substitute NumVar for IntVar. If the X variables are 0-1, substitute BoolVar.
import numpy as np
a = np.array([1, 1, 0, 0, 1, 1, 1])
b = np.array([1, 1, 0, 0, 1, 0, 1])
c = np.array([1, 0, 0, 1, 0, 0, 0])
d = np.array([1, 0, 1, 1, 0, 0, 0])
opt = [a, b, c, d]
req = np.array([50, 50, 20, 20, 100, 40, 10])
from ortools.linear_solver import pywraplp
solver = pywraplp.Solver.CreateSolver("SCIP")
x = [solver.IntVar(0, solver.infinity(), "x{}".format(i)) for i in range(len(opt))]
extra = [solver.NumVar(0, solver.infinity(), "y{}".format(j)) for j in range(len(req))]
for j, (req_j, extra_j) in enumerate(zip(req, extra)):
    solver.Add(extra_j == sum(opt_i[j] * x_i for (opt_i, x_i) in zip(opt, x)) - req_j)
solver.Minimize(sum(extra))
status = solver.Solve()
if status == pywraplp.Solver.OPTIMAL:
    print("Solution:")
    print("Objective value =", solver.Objective().Value())
    for i, x_i in enumerate(x):
        print("x{} = {}".format(i, x_i.solution_value()))
else:
    print("The problem does not have an optimal solution.")
Output:
Solution:
Objective value = 210.0
x0 = 40.0
x1 = 60.0
x2 = -0.0
x3 = 20.0
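If you'd rather not add OR-Tools as a dependency, the same covering model can be written with SciPy's MILP interface. The following is only a sketch, assuming SciPy >= 1.9 (which provides scipy.optimize.milp); since sum(req) is a constant, minimizing the total overshoot is equivalent to minimizing the total coverage.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds
opt = np.array([[1, 1, 0, 0, 1, 1, 1],
                [1, 1, 0, 0, 1, 0, 1],
                [1, 0, 0, 1, 0, 0, 0],
                [1, 0, 1, 1, 0, 0, 0]])
req = np.array([50, 50, 20, 20, 100, 40, 10])
c = opt.sum(axis=1)                                    # total coverage contributed by one unit of each option
coverage = LinearConstraint(opt.T, lb=req, ub=np.inf)  # opt.T @ x >= req, element-wise
res = milp(c=c, constraints=coverage,
           integrality=np.ones(len(c)),                # all x integer
           bounds=Bounds(0, np.inf))                   # x >= 0
print(res.x, res.fun - req.sum())                      # multipliers and total overshoot
With the data above this should reproduce the OR-Tools result (total overshoot 210).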

Compute precision and accuracy using numpy

Suppose two lists true_values = [1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0] and predictions = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0]. How can I compute the accuracy and the precision using numpy?
import numpy as np
true_values = np.array([[1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]])
predictions = np.array([[1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0]])
N = true_values.shape[1]
accuracy = (true_values == predictions).sum() / N
TP = ((predictions == 1) & (true_values == 1)).sum()
FP = ((predictions == 1) & (true_values == 0)).sum()
precision = TP / (TP+FP)
This is the most concise way I came up with (assuming no sklearn), though there might be even shorter ways!
This is what sklearn, which uses numpy behind the curtain, is for:
from sklearn.metrics import precision_score, accuracy_score
accuracy_score(true_values, predictions), precision_score(true_values, predictions)
Output:
(0.3333333333333333, 0.375)
I'm only going to answer for precision, because I posted a duplicate for accuracy and there should be one question per thread:
sum(map(lambda x, y: x == y == 1, true_values, predictions)) / sum(predictions)
0.375
(Note that dividing by sum(true_values) instead would give recall rather than precision.)
Use np.sum if you absolutely want to use NumPy.
Here is the accuracy, i.e. the mean of the element-wise comparison:
np.equal(true_values, predictions).mean()
If you really want to calculate it yourself instead of using a library like in Quang Hoang's answer, just count the number of (true|false) (positives|negatives) and plug the values into your formula:
tp = 0  # true positives
tn = 0  # true negatives
fp = 0  # false positives
fn = 0  # false negatives
for t, p in zip(true_values, predictions):
    if t == p:
        if p == 1:
            tp += 1
        else:
            tn += 1
    else:
        if p == 1:
            fp += 1  # predicted 1, actual 0
        else:
            fn += 1  # predicted 0, actual 1
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
This Wikipedia article has a nice info box with formulas that use exactly these values.
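For completeness, here is a vectorized sketch of the same counting, assuming 1-D NumPy arrays of 0/1 labels: encoding each (true, predicted) pair as a small integer lets np.bincount produce the whole confusion matrix at once.
import numpy as np
true_values = np.array([1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
predictions = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0])
# 2*true + predicted maps (0,0)->0 (TN), (0,1)->1 (FP), (1,0)->2 (FN), (1,1)->3 (TP)
tn, fp, fn, tp = np.bincount(2 * true_values + predictions, minlength=4)
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
print(accuracy, precision)  # 0.333..., 0.375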

calculate sum of Nth column of numpy array entry grouped by the indices in first two columns?

I would like to loop over the following check_matrix in such a way that the code recognizes whether the first and second elements are 1 and 1, or 1 and 2, etc. Then, for each separate class of pair, i.e. (1,1), (1,2) or (2,2), the code should store in new matrices the sum of the last element (which in this case has index 8) times exp(-i*np.dot(q, check_matrix[k][2:5]-check_matrix[k][5:8])), where i is the imaginary unit, k is the running index over check_matrix, and q is a vector defined below. So there are 20 q vectors.
import numpy as np
q = []
for i in np.linspace(0, 10, 20):
    q.append(np.array((0, 0, i)))
q = np.array(q)
check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
[1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
[1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
[2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
[2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
[2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])
This means in principles I will have to have 20 matrices of shape 2x2, corresponding to each q vector.
For the moment my code gives only one matrix, which appears to be the last one, even though I am appending to Matrices. My code looks like this:
for i in range(2):
    i = i+1
    for j in range(2):
        j = j+1
        j_list = []
        Matrices = []
        for k in range(len(check_matrix)):
            if check_matrix[k][0] == i and check_matrix[k][1] == j:
                j_list.append(check_matrix[k][8]*np.exp(-1J*np.dot(q,(np.subtract(check_matrix[k][2:5],check_matrix[k][5:8])))))
        j_11 = np.sum(j_list)
        I_matrix[i-1][j-1] = j_11
        Matrices.append(I_matrix)
I_matrix is defined as below:
I_matrix= np.zeros((2,2),dtype=np.complex_)
At the moment I get the following output.
Matrices = [array([[-0.66071446-0.77603624j, -0.29038112+2.34855023j], [-0.31387562-0.08116629j, 4.2788 +0.j ]])]
But I want to get a matrix corresponding to each q value, meaning that in total there should be 20 matrices in this case, where each element of a 2x2 matrix contains the sum over the entries belonging to the (1,1), (1,2), (2,1) and (2,2) pairs, arranged as follows:
array([[11., 12.],
[21., 22.]])
I would highly appreciate your suggestions to correct it. Thanks in advance!
I am pretty sure you can solve this problem in an easier way, and I am not 100% sure that I understood you correctly, but here is some code that does what I think you want. If you have a way to check whether the results are valid, I suggest you do so.
import numpy as np
n = 20
q = np.zeros((20, 3))
q[:, -1] = np.linspace(0, 10, n)
check_matrix = np.array([[1, 1, 0, 0, 0, 0, 0, -0.7977, -0.243293],
[1, 1, 0, 0, 0, 0, 0, 1.5954, 0.004567],
[1, 2, 0, 0, 0, -1, 0, 0, 1.126557],
[2, 1, 0, 0, 0, 0.5, 0.86603, 1.5954, 0.038934],
[2, 1, 0, 0, 0, 2, 0, -0.7977, -0.015192],
[2, 2, 0, 0, 0, -0.5, 0.86603, 1.5954, 0.21394]])
check_matrix[:, :2] -= 1 # python indexing is zero based
matrices = np.zeros((n, 2, 2), dtype=np.complex_)
for i in range(2):
    for j in range(2):
        k_list = []
        for k in range(len(check_matrix)):
            if check_matrix[k][0] == i and check_matrix[k][1] == j:
                k_list.append(check_matrix[k][8] *
                              np.exp(-1J * np.dot(q, check_matrix[k][2:5]
                                                     - check_matrix[k][5:8])))
        matrices[:, i, j] = np.sum(k_list, axis=0)
NOTE: I changed your indices to have consistent zero-based indexing.
Here is another approach where I replaced the k-loop with a vectorized version:
for i in range(2):
    for j in range(2):
        k = np.logical_and(check_matrix[:, 0] == i, check_matrix[:, 1] == j)
        temp = np.dot(check_matrix[k, 2:5] - check_matrix[k, 5:8], q[:, :, np.newaxis])[..., 0]
        temp = check_matrix[k, 8:] * np.exp(-1J * temp)
        matrices[:, i, j] = np.sum(temp, axis=0)
3-line solution
You asked for an efficient solution in your original title, so how about this three-liner that avoids nested loops and if statements, and is thus hopefully faster?
fac=2*(check_matrix[:,0]-1)+(check_matrix[:,1]-1)
grp=np.split(check_matrix[:,8], np.cumsum(np.unique(fac,return_counts=True)[1])[:-1])
[np.sum(x) for x in grp]
output:
[-0.23872600000000002, 1.126557, 0.023742000000000003, 0.21394]
How does it work?
I combine the first two columns into a single index, treating each as "bits" (i.e. base 2)
fac=2*(check_matrix[:,0]-1)+(check_matrix[:,1]-1)
(If you have indices that exceed 2, you can still use this technique, but you will need a different base to combine the columns; i.e. if your indices go from 1 to 18, you would need to multiply column 0 by a number equal to or larger than 18 instead of 2.)
So the result of the first line is
array([0., 0., 1., 2., 2., 3.])
Note as well that this assumes the data is ordered so that one column changes fastest; if this is not the case, you will need an extra step to sort the index and the original check_matrix (a sketch of that step appears at the end of this answer). In your example the data is ordered.
The next step groups the data according to the index, and uses the solution posted here.
np.split(check_matrix[:,8], np.cumsum(np.unique(fac,return_counts=True)[1])[:-1])
[array([-0.243293, 0.004567]), array([1.126557]), array([ 0.038934, -0.015192]), array([0.21394])]
i.e. it outputs the column of check_matrix at index 8, grouped according to fac.
Then the last line simply sums those groups. Knowing how the first two columns were combined to give the single index allows you to map the result back, or you could simply add fac to check_matrix as an extra column if you wanted.
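For reference, here is a sketch of the extra sorting step mentioned above, in case the rows are not already ordered by the combined index (it assumes fac and check_matrix as defined earlier in this answer):
order = np.argsort(fac, kind="stable")   # sort rows by the combined index
fac_sorted = fac[order]
vals_sorted = check_matrix[order, 8]
grp = np.split(vals_sorted, np.cumsum(np.unique(fac_sorted, return_counts=True)[1])[:-1])
[np.sum(x) for x in grp]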

Replace in array of zeros with other values in certain cells_updated question

I need to solve a problem on which I have spent hours. With the data from my Excel sheet I have created a 6x36 'zeros' matrix of zeros and a 6x6 'matrix_tran' coordinate transformation matrix [image 1].
My problem is that I can't find a way to replace the zeros of the 'zeros' matrix with the values dictated by 'matrix_tran', whose locations must be the columns (4, 5, 6, 7, 8, 9) given by the connection vector (4,5,6,7,8,9) of element 15 of the Excel sheet, that is, the last row of the for-loop iteration [image 2].
In summary: below I show what I currently get and what it should look like [images 3 and 4 respectively].
I would very much appreciate your help; please excuse my English, it is not my native language. Best regards.
import pandas as pd
import numpy as np
ex = pd.ExcelFile('matrix_tr.xlsx')
hoja = ex.parse('Hoja1')
cols = 36
for n in range(0, len(hoja)):
    A = hoja['ELEMENT #'][n]
    B = hoja['1(i)'][n]
    C = hoja['2(i)'][n]
    D = hoja['3(i)'][n]
    E = hoja['1(j)'][n]
    F = hoja['2(j)'][n]
    G = hoja['3(j)'][n]
    H = hoja['X(i)'][n]
    I = hoja['Y(i)'][n]
    J = hoja['X(j)'][n]
    K = hoja['Y(j)'][n]
    L = np.sqrt((J-H)**2+(K-I)**2)
    lx = (J-H)/L
    ly = (K-I)/L
zeros = np.zeros((6, cols))
counters = hoja.loc[:, ["1(i)", "2(i)", "3(i)", "1(j)", "2(j)", "3(j)"]]
for _, i1, i2, i3, j1, j2, j3 in counters.itertuples():
    matrix_tran = np.array([[lx, ly, 0, 0, 0, 0],
                            [-ly, lx, 0, 0, 0, 0],
                            [0, 0, 1, 0, 0, 0],
                            [0, 0, 0, lx, ly, 0],
                            [0, 0, 0, -ly, lx, 0],
                            [0, 0, 0, 0, 0, 1]])
    zeros[:, [i1 - 1, i2 - 1, i3 - 1, j1 - 1, j2 - 1, j3 - 1]] = matrix_tran
Try with a transposed zeros matrix
import pandas as pd
import numpy as np
ex = pd.ExcelFile('c:/tmp/SO/matrix_tr.xlsx')
hoja = ex.parse('Hoja1')
counters = hoja.loc[:, ["1(i)", "2(i)", "3(i)", "1(j)", "2(j)", "3(j)"]]
# zeros matrix transposed
cols = 36
zeros_trans = np.zeros((cols,6))
# last row only
for n in range(14, len(hoja)):
    Xi = hoja['X(i)'][n]
    Yi = hoja['Y(i)'][n]
    Xj = hoja['X(j)'][n]
    Yj = hoja['Y(j)'][n]
    X = Xj-Xi
    Y = Yj-Yi
    L = np.sqrt(X**2+Y**2)
    lx = X/L
    ly = Y/L
    matrix_tran = np.array([[lx, ly, 0, 0, 0, 0],
                            [-ly, lx, 0, 0, 0, 0],
                            [0, 0, 1, 0, 0, 0],
                            [0, 0, 0, lx, ly, 0],
                            [0, 0, 0, -ly, lx, 0],
                            [0, 0, 0, 0, 0, 1]])
    i = 0
    for r in counters.iloc[n]:
        zeros_trans[r-1] = matrix_tran[i]
        i += 1
print(np.transpose(zeros_trans))

Discrepancies between R optim vs Scipy optimize: Nelder-Mead

I wrote a script that I believe should produce the same results in Python and R, but they are producing very different answers. Each attempts to fit a model to simulated data by minimizing deviance using Nelder-Mead. Overall, optim in R is performing much better. Am I doing something wrong? Are the algorithms implemented in R and SciPy different?
Python result:
>>> res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
final_simplex: (array([[-0.21483287, -1. , -0.4645897 , -4.65108495],
[-0.21483909, -1. , -0.4645915 , -4.65114839],
[-0.21485426, -1. , -0.46457789, -4.65107337],
[-0.21483727, -1. , -0.46459331, -4.65115965],
[-0.21484398, -1. , -0.46457725, -4.65099805]]), array([107.46037865, 107.46037868, 107.4603787 , 107.46037875,
107.46037875]))
fun: 107.4603786452194
message: 'Optimization terminated successfully.'
nfev: 349
nit: 197
status: 0
success: True
x: array([-0.21483287, -1. , -0.4645897 , -4.65108495])
R result:
> res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
$par
[1] 0.2641022 1.0000000 0.2086496 3.6688737
$value
[1] 110.4249
$counts
function gradient
329 NA
$convergence
[1] 0
$message
NULL
I've checked over my code and as far as I can tell this appears to be due to some difference between optim and minimize because the function I'm trying to minimize (i.e., choiceProbDev) operates the same in each (besides the output, I've also checked the equivalence of each step within the function). See for example:
Python choiceProbDev:
>>> choiceProbDev(np.array([0.5, 0.5, 0.5, 3]), stim, dflt, dat, N)
143.31438613033876
R choiceProbDev:
> choiceProbDev(c(0.5, 0.5, 0.5, 3), stim, dflt, dat, N)
[1] 143.3144
I've also tried to play around with the tolerance levels for each optimization function, but I'm not entirely sure how the tolerance arguments match up between the two. Either way, my fiddling so far hasn't brought the two into agreement. Here is the entire code for each.
Python:
# load modules
import math
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom
# initialize values
dflt = 0.5
N = 1
# set the known parameter values for generating data
b = 0.1
w1 = 0.75
w2 = 0.25
t = 7
theta = [b, w1, w2, t]
# generate stimuli
stim = np.array(np.meshgrid(np.arange(0, 1.1, 0.1),
np.arange(0, 1.1, 0.1))).T.reshape(-1,2)
# starting values
sparams = [-0.5, -0.5, -0.5, 4]
# generate probability of accepting proposal
def choiceProb(stim, dflt, theta):
    utilProp = theta[0] + theta[1]*stim[:,0] + theta[2]*stim[:,1]  # proposal utility
    utilDflt = theta[1]*dflt + theta[2]*dflt  # default utility
    choiceProb = 1/(1 + np.exp(-1*theta[3]*(utilProp - utilDflt)))  # probability of choosing proposal
    return choiceProb
# calculate deviance
def choiceProbDev(theta, stim, dflt, dat, N):
    # restrict b, w1, w2 weights to between -1 and 1
    if any([x > 1 or x < -1 for x in theta[:-1]]):
        return 10000
    # initialize
    nDat = dat.shape[0]
    dev = np.array([np.nan]*nDat)
    # for each trial, calculate deviance
    p = choiceProb(stim, dflt, theta)
    lk = binom.pmf(dat, N, p)
    for i in range(nDat):
        if math.isclose(lk[i], 0):
            dev[i] = 10000
        else:
            dev[i] = -2*np.log(lk[i])
    return np.sum(dev)
# simulate data
probs = choiceProb(stim, dflt, theta)
# randomly generated data based on the calculated probabilities
# dat = np.random.binomial(1, probs, probs.shape[0])
dat = np.array([0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
# fit model
res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
R:
library(tidyverse)
# initialize values
dflt <- 0.5
N <- 1
# set the known parameter values for generating data
b <- 0.1
w1 <- 0.75
w2 <- 0.25
t <- 7
theta <- c(b, w1, w2, t)
# generate stimuli
stim <- expand.grid(seq(0, 1, 0.1),
seq(0, 1, 0.1)) %>%
dplyr::arrange(Var1, Var2)
# starting values
sparams <- c(-0.5, -0.5, -0.5, 4)
# generate probability of accepting proposal
choiceProb <- function(stim, dflt, theta){
  utilProp <- theta[1] + theta[2]*stim[,1] + theta[3]*stim[,2] # proposal utility
  utilDflt <- theta[2]*dflt + theta[3]*dflt # default utility
  choiceProb <- 1/(1 + exp(-1*theta[4]*(utilProp - utilDflt))) # probability of choosing proposal
  return(choiceProb)
}
# calculate deviance
choiceProbDev <- function(theta, stim, dflt, dat, N){
  # restrict b, w1, w2 weights to between -1 and 1
  if (any(theta[1:3] > 1 | theta[1:3] < -1)){
    return(10000)
  }
  # initialize
  nDat <- length(dat)
  dev <- rep(NA, nDat)
  # for each trial, calculate deviance
  p <- choiceProb(stim, dflt, theta)
  lk <- dbinom(dat, N, p)
  for (i in 1:nDat){
    if (dplyr::near(lk[i], 0)){
      dev[i] <- 10000
    } else {
      dev[i] <- -2*log(lk[i])
    }
  }
  return(sum(dev))
}
# simulate data
probs <- choiceProb(stim, dflt, theta)
# same data as in python script
dat <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
# fit model
res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
UPDATE:
After printing the estimates at each iteration, it now appears to me that the discrepancy might stem from differences in the 'step sizes' each algorithm takes. SciPy appears to take smaller steps than optim (and in a different initial direction). I haven't figured out how to adjust this (see the sketch after the traces below for one possible knob).
Python:
>>> res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead')
[-0.5 -0.5 -0.5 4. ]
[-0.525 -0.5 -0.5 4. ]
[-0.5 -0.525 -0.5 4. ]
[-0.5 -0.5 -0.525 4. ]
[-0.5 -0.5 -0.5 4.2]
[-0.5125 -0.5125 -0.5125 3.8 ]
...
R:
> res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N, method="Nelder-Mead")
[1] -0.5 -0.5 -0.5 4.0
[1] -0.1 -0.5 -0.5 4.0
[1] -0.5 -0.1 -0.5 4.0
[1] -0.5 -0.5 -0.1 4.0
[1] -0.5 -0.5 -0.5 4.4
[1] -0.3 -0.3 -0.3 3.6
...
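For what it's worth, SciPy's Nelder-Mead accepts an explicit initial_simplex option, so the larger initial perturbations seen in the R trace can be imitated by hand. The following is only a sketch, reusing choiceProbDev, stim, dflt, dat and N from the code above; the 0.4 step size is an assumption read off the R trace, not optim's documented rule.
import numpy as np
from scipy.optimize import minimize
sparams = np.array([-0.5, -0.5, -0.5, 4.0])
step = 0.4  # assumed: roughly the perturbation optim appears to use here
init_simplex = np.vstack([sparams] + [sparams + step * e for e in np.eye(len(sparams))])
res = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='Nelder-Mead',
               options={'initial_simplex': init_simplex})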
This isn't exactly an answer of "what are the optimizer differences", but I want to contribute some exploration of the optimization problem here. A few take-home points:
the surface is smooth, so derivative-based optimizers might work better (even without an explicitly coded gradient function, i.e. falling back on finite difference approximation - they'd be even better with a gradient function)
this surface is symmetric, so it has multiple optima (apparently two), but it's not highly multimodal or rough, so I don't think a stochastic global optimizer would be worth the trouble
for optimization problems that aren't too high-dimensional or expensive to compute, it's feasible to visualize the global surface to understand what's going on.
for optimization with bounds, it's generally better either to use an optimizer that explicitly handles bounds, or to change the scale of parameters to an unconstrained scale (a quick Python sketch of the bounded approach follows this list)
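As a sketch of that last point from the Python side (reusing choiceProbDev, sparams, stim, dflt, dat and N from the question's Python code, and mirroring the R L-BFGS-B fit further down; leaving the fourth parameter unbounded is an assumption):
from scipy.optimize import minimize
# bounds: b, w1, w2 in [-1, 1]; t left unbounded
res_lbfgsb = minimize(choiceProbDev, sparams, (stim, dflt, dat, N), method='L-BFGS-B',
                      bounds=[(-1, 1), (-1, 1), (-1, 1), (None, None)])
print(res_lbfgsb.x, res_lbfgsb.fun)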
Here's a picture of the whole surface:
The red contours are the contours of log-likelihood equal to (110, 115, 120) (the best fit I could get was LL=105.7). The best points are in the second column, third row (achieved by L-BFGS-B) and fifth column, fourth row (true parameter values). (I haven't inspected the objective function to see where the symmetries come from, but I think it would probably be clear.) Python's Nelder-Mead and R's Nelder-Mead do approximately equally badly.
parameters and problem setup
## initialize values
dflt <- 0.5; N <- 1
# set the known parameter values for generating data
b <- 0.1; w1 <- 0.75; w2 <- 0.25; t <- 7
theta <- c(b, w1, w2, t)
# generate stimuli
stim <- expand.grid(seq(0, 1, 0.1), seq(0, 1, 0.1))
# starting values
sparams <- c(-0.5, -0.5, -0.5, 4)
# same data as in python script
dat <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1,
0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1,
0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
objective functions
Note the use of built-in functions (plogis(), dbinom(..., log=TRUE)) where possible.
# generate probability of accepting proposal
choiceProb <- function(stim, dflt, theta){
  utilProp <- theta[1] + theta[2]*stim[,1] + theta[3]*stim[,2] # proposal utility
  utilDflt <- theta[2]*dflt + theta[3]*dflt # default utility
  choiceProb <- plogis(theta[4]*(utilProp - utilDflt)) # probability of choosing proposal
  return(choiceProb)
}
# calculate deviance
choiceProbDev <- function(theta, stim, dflt, dat, N){
  # restrict b, w1, w2 weights to between -1 and 1
  if (any(theta[1:3] > 1 | theta[1:3] < -1)){
    return(10000)
  }
  ## for each trial, calculate deviance
  p <- choiceProb(stim, dflt, theta)
  lk <- dbinom(dat, N, p, log=TRUE)
  return(sum(-2*lk))
}
# simulate data
probs <- choiceProb(stim, dflt, theta)
model fitting
# fit model
res <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
## try derivative-based, box-constrained optimizer
res3 <- optim(sparams, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
lower=c(-1,-1,-1,-Inf), upper=c(1,1,1,Inf),
method="L-BFGS-B")
py_coefs <- c(-0.21483287, -0.4645897 , -1, -4.65108495) ## transposed?
true_coefs <- c(0.1, 0.25, 0.75, 7) ## transposed?
## start from python coeffs
res2 <- optim(py_coefs, choiceProbDev, stim=stim, dflt=dflt, dat=dat, N=N,
method="Nelder-Mead")
explore log-likelihood surface
cc <- expand.grid(seq(-1,1,length.out=51),
seq(-1,1,length.out=6),
seq(-1,1,length.out=6),
seq(-8,8,length.out=51))
## utility function for combining parameter values
bfun <- function(x, grid_vars=c("Var2","Var3"), grid_rng=seq(-1,1,length.out=6),
                 type=NULL) {
  if (is.list(x)) {
    v <- c(x$par, x$value)
  } else if (length(x)==4) {
    v <- c(x, NA)
  }
  res <- as.data.frame(rbind(setNames(v, c(paste0("Var",1:4), "z"))))
  for (v in grid_vars)
    res[,v] <- grid_rng[which.min(abs(grid_rng-res[,v]))]
  if (!is.null(type)) res$type <- type
  res
}
resdat <- rbind(bfun(res3,type="R_LBFGSB"),
bfun(res,type="R_NM"),
bfun(py_coefs,type="Py_NM"),
bfun(true_coefs,type="true"))
cc$z <- apply(cc,1,function(x) choiceProbDev(unlist(x), dat=dat, stim=stim, dflt=dflt, N=N))
library(ggplot2)
library(viridisLite)
ggplot(cc,aes(Var1,Var4,fill=z))+
geom_tile()+
facet_grid(Var2~Var3,labeller=label_both)+
scale_fill_viridis_c()+
scale_x_continuous(expand=c(0,0))+
scale_y_continuous(expand=c(0,0))+
theme(panel.spacing=grid::unit(0,"lines"))+
geom_contour(aes(z=z),colour="red",breaks=seq(105,120,by=5),alpha=0.5)+
geom_point(data=resdat,aes(colour=type,shape=type))+
scale_colour_brewer(palette="Set1")
ggsave("liksurf.png",width=8,height=8)
'Nelder-Mead' has always been a problematic optimization method, and its coding in optim is not up-to-date. We will try three other implementations available in R packages.
To avoid having to pass the other parameters every time, let's define the function fn as
fn <- function(theta)
  choiceProbDev(theta, stim=stim, dflt=dflt, dat=dat, N=N)
Then the solvers dfoptim::nmk(), adagio::neldermead(), and pracma::anms() will all return the same minimum value xmin = 105.7843, but at different positions, for instance
dfoptim::nmk(sparams, fn)
## $par
## [1] 0.1274937 0.6671353 0.1919542 8.1731618
## $value
## [1] 105.7843
These are real local minima while, for example, the Python solution 107.46038 at c(-0.21483287,-1.0,-0.4645897,-4.65108495) is not. Your problem data are obviously not sufficient for fitting the model.
You might try a global optimizer to possibly find a global optimum within certain bounds. To me it looks like all local minima have the same minimum value.
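If you want to try that global-optimizer route from the Python side, here is a sketch using SciPy's differential evolution with box constraints, reusing choiceProbDev, stim, dflt, dat and N from the question's Python code; the bound on the fourth parameter is an arbitrary assumption, not part of the model.
from scipy.optimize import differential_evolution
bounds = [(-1, 1), (-1, 1), (-1, 1), (-10, 10)]  # last bound is an assumption
res_global = differential_evolution(choiceProbDev, bounds, args=(stim, dflt, dat, N), seed=0)
print(res_global.x, res_global.fun)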

Solving linear regression equation for a sparse matrix

I'd like to solve a multivariate linear regression equation for a vector X with m elements, given n observations Y, and assuming the measurements have Gaussian random errors. How can I solve this problem using Python? My problem looks like this:
A simple example of W when m=5, is given as follows:
P.S. I would like to take the effect of the errors into account; specifically, I want to estimate the standard deviation of the errors.
You can do it like this
import numpy as np

def myreg(W, Y):
    from numpy.linalg import pinv
    m, n = Y.shape
    k = W.shape[1]
    X = pinv(W.T.dot(W)).dot(W.T).dot(Y)
    Y_hat = W.dot(X)
    Residuals = Y_hat - Y
    MSE = np.square(Residuals).sum(axis=0) / (m - 2)
    X_var = (MSE[:, None] * pinv(W.T.dot(W)).flatten()).reshape(n, k, k)
    Tstat = X / np.sqrt(X_var.diagonal(axis1=1, axis2=2)).T
    return X, Tstat
demo
W = np.array([
[ 1, -1, 0, 0, 1],
[ 0, -1, 1, 0, 0],
[ 1, 0, 0, 1, 0],
[ 0, 1, -1, 0, 1],
[ 1, -1, 0, 0, -1],
])
Y = np.array([2, 4, 1, 5, 3])[:, None]
X, V = myreg(W, Y)
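Regarding the P.S. about the standard deviation of the errors: as long as there are more observations than unknowns, it can be estimated from the residual sum of squares. Here is a sketch with hypothetical data; the shapes, noise level and the use of np.linalg.lstsq are illustrative assumptions, not part of the answer above.
import numpy as np
rng = np.random.default_rng(0)
n, m = 20, 5                                   # n observations, m unknowns, n > m
W = rng.normal(size=(n, m))
Y = W @ rng.normal(size=(m, 1)) + 0.1 * rng.normal(size=(n, 1))  # synthetic example data
X, rss, rank, sv = np.linalg.lstsq(W, Y, rcond=None)     # least-squares estimate of X
residuals = Y - W @ X
sigma_hat = np.sqrt((residuals**2).sum() / (n - m))      # estimated error standard deviation
print(X.ravel(), sigma_hat)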
