I have a large M x N 2D matrix of positive coefficients (8-bit unsigned) with M, N ~ 10^3.
I want to optimize the parameters (M*N ~ 10^6) so that my error function, which takes the matrix as input, is minimized. Boundary condition: neighboring parameters vary slowly/smoothly.
I have already tried downscaling the matrix to reduce the number of parameters and then flattening it to feed it into SciPy's minimize function, but this quickly runs into memory errors.
Is there a smart approach to optimizing this large coefficient matrix in less-than-infinite time without resorting to a low-parametric model?
How can I turn this log-likelihood equation, written in 2D matrix form, into a 1D array format so I can use scipy.minimize on it? The log-likelihood function is as follows:
term = A @ (X.shift(1).fillna(0)) - X
min_fun = term.T @ np.diag(mat) @ term
where X is a known 2D array in time-series format (M x T), A is a square 2D array (M x M) whose elements I want to estimate, and mat is a vector of length M (np.diag(mat) turns it into a diagonal matrix). Note that the problem is high-dimensional, so I end up with many equations and am looking for the best way to put this parameter estimation into a 1D format.
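One way to get a "1D format" is to keep the unknown entries of A flattened as a single vector for scipy.optimize.minimize and reshape them back inside the objective. Here is a minimal sketch, not the poster's exact setup: it assumes synthetic X and mat, uses a plain numpy shift instead of pandas, and collapses term.T @ np.diag(mat) @ term to a scalar via the trace.

import numpy as np
from scipy.optimize import minimize

M, T = 4, 50                              # kept small so the sketch runs quickly
rng = np.random.default_rng(0)
X = rng.standard_normal((M, T))           # known time series, shape (M, T)
w = rng.random(M)                         # stands in for mat, length M
W = np.diag(w)                            # diagonal weight matrix

X_lag = np.zeros_like(X)
X_lag[:, 1:] = X[:, :-1]                  # numpy equivalent of X.shift(1).fillna(0)

def objective(a_flat):
    A = a_flat.reshape(M, M)              # un-flatten the 1D parameter vector
    term = A @ X_lag - X                  # residuals, shape (M, T)
    return np.trace(term.T @ W @ term)    # scalar objective

res = minimize(objective, np.zeros(M * M), method="L-BFGS-B")
A_hat = res.x.reshape(M, M)               # estimated coefficients back in (M, M) form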
I'm trying to fit a Gaussian to this set of data: it is a 2D matrix of values (a probability distribution), and plotted in 3D it looks like a Gaussian surface.
As far as I understood from this other question (https://mathematica.stackexchange.com/questions/27642/fitting-a-two-dimensional-gaussian-to-a-set-of-2d-pixels) I need to compute the mean and the covariance matrix of my data and the Gaussian that I need will be exactly the one defined by that mean and covariance matrix.
However, I cannot properly follow the code in that other question (it is Mathematica, not Python), and I am pretty stuck on the statistics.
How would I compute in Python (Numpy, PyTorch...) the mean and the covariance matrix of the Gaussian?
I'm trying to avoid all these optimization frameworks (LSQ, KDE) as I think that the solution is much simpler and the computational cost is something that I have to take into account...
Thanks!
Let's call your data matrix D with shape d x n where d is the data dimension and n is the number of samples. I will assume that in your example, d=5 and n=6, although you will need to determine for yourself which is the data dimension and which is the sample dimension. In that case, we can find the mean and covariance using the following code:
import numpy as np
n = 6                         # number of samples
d = 5                         # data dimension
D = np.random.random([d, n])  # stand-in data; replace with your actual (d, n) matrix
mean = D.mean(axis=1)         # mean over the sample axis -> vector of length d
covariance = np.cov(D)        # (d, d) covariance matrix; np.cov treats rows as variables
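If you then want to evaluate the fitted Gaussian, one option (an addition, not part of the answer above) is scipy.stats.multivariate_normal built from that mean and covariance:

from scipy.stats import multivariate_normal

gaussian = multivariate_normal(mean=mean, cov=covariance)  # frozen d-dimensional Gaussian
print(gaussian.pdf(mean))                                  # density evaluated at the mean itself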
I'm trying to understand logistic and linear regression and was able to understand the theory behind them (I'm doing the Andrew Ng course).
We have X -> the given features -> a matrix of shape (m, n+1), where m is the number of cases and n is the number of given features (excluding x0).
We have y -> the label to predict -> a matrix of shape (m, 1).
Now, while implementing it from scratch in Python, I'm confused as to why we use the transpose of theta in the sigmoid function.
We also use theta transpose X for linear regression.
We do not have to perform matrix multiplication anywhere while coding; it's straight element-to-element coding. So what is the need for the transpose? Or is my understanding wrong, and do we need to use matrix multiplication during implementation?
My main concern is that I'm very confused as to where we do matrix multiplication and where we do element-wise multiplication in logistic and linear regression.
You are a bit off topic for this area, but the piece you appear to be hung up on is the treatment of x and Theta.
In the use cases you describe, x is a vector of inputs, or the "feature vector". The theta vector is the vector of coefficients. Both are usually expressed as column vectors and, of course, must be of the same dimension.
So to "make a prediction" you need the inner product of these two, and the output needs to be a scalar (by definition of an inner product), so you need to transpose the theta vector to express that operation properly as a matrix multiplication of two vectors. Make sense?
For matrix multiplication, the number of columns in the first element must equal the number of rows in the second element. Since one of the elements you're multiplying has either one column or one row, it does not look like matrix multiplication due to its simplicity, but it still is matrix multiplication.
Let me provide an example.
Let A be an (m, n) matrix.
We can perform scalar multiplication of A by some fixed real number a.
If we want to multiply A by some vector x, we need to meet some restrictions. Here it is common to mistake the dot product for matrix multiplication, but they serve completely different purposes.
So the restriction for multiplying an (m, n) matrix A by a vector x is that x must have the same number of entries as A has columns. To do this in your example, one of the elements needed to be transposed, as shown in the sketch below.
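A short shape-bookkeeping sketch (assumed shapes, not taken from the question): with X stored as (m, n+1) and theta as a column vector, X @ theta already lines up, while a single column-vector example needs theta transposed.

import numpy as np

m, n1 = 5, 4                        # m examples, n+1 features (including x0)
X = np.random.random((m, n1))       # design matrix, shape (m, n+1)
theta = np.random.random((n1, 1))   # coefficients as a column vector

predictions = X @ theta             # (m, n+1) @ (n+1, 1) -> (m, 1): columns of X match rows of theta

x = X[0].reshape(-1, 1)             # one example as a column vector, shape (n+1, 1)
single = theta.T @ x                # (1, n+1) @ (n+1, 1) -> (1, 1): the transpose makes the dimensions agree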
I know that numpy.cov calculates the covariance given a N dimensional array.
I can see from the documentation on GitHub that the normalisation is done by (N-1). But for my specific case, the covariance matrix is given by:
where x_i is the quantity and i and j are the bins.
As you can see from the above equation, this covariance matrix is normalised by (N-1)/N.
To get the above normalisation, can I simply multiply the covariance matrix obtained from numpy.cov by (N-1)**2 / N? Is that correct?
Or should I use the bias parameter inside numpy.cov? If so, how?
There are two ways of doing this.
We can call np.cov with bias=1 (i.e. bias=True) and then multiply the result by N-1,
or
we can multiply the covariance matrix obtained with the default settings by (N-1)**2 / N.
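A quick numeric check that the two routes agree (synthetic data, assumed shapes):

import numpy as np

x = np.random.random((3, 100))                 # 3 variables, N = 100 observations each
N = x.shape[1]

cov1 = np.cov(x, bias=True) * (N - 1)          # divide by N, then scale by N-1
cov2 = np.cov(x) * (N - 1) ** 2 / N            # divide by N-1, then scale by (N-1)**2 / N

print(np.allclose(cov1, cov2))                 # True: both give the (N-1)/N normalisation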