One of the hallmarks of factor analysis is that it allows for non-orthogonal latent variables.
In R for example this feature is accessible via the rotation parameter of factanal.
Is there any such provision for sklearn.decomposition.FactorAnalysis? Clearly it's not among the arguments - but maybe there is another way to achieve this?
Sadly I have been unable to find many examples of usage for this function.
Interesting question. Indeed, I don't think any rotations are implemented - see this issue.
Maybe this implementation is what you are looking for.
It appears this is now implemented.
Example: https://scikit-learn.org/stable/auto_examples/decomposition/plot_varimax_fa.html
rotation : {‘varimax’, ‘quartimax’}, default=None
If not None, apply the indicated rotation. Currently, varimax and
quartimax are implemented. See “The varimax criterion for analytic
rotation in factor analysis” H. F. Kaiser, 1958.
New in version 0.24.
Source: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.FactorAnalysis.html
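For reference, a minimal sketch of how the rotation argument is used (the Iris data here is just a stand-in dataset, and two components are an arbitrary choice):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis

X, _ = load_iris(return_X_y=True)

# request varimax-rotated loadings (requires scikit-learn >= 0.24)
fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
fa.fit(X)

print(fa.components_)  # rotated factor loadings, shape (n_components, n_features)
```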
I'm trying to fit my data and have so far used sp.optimize.leastsq. I changed to sp.optimize.least_squares to add bounds on the parameters, but both when I use bounds and when I don't, the search doesn't converge, even on data sets sp.optimize.leastsq fitted just fine.
Shouldn't these functions work the same?
What could be the difference between them that makes the newer one fail to find solutions the older one did?
leastsq is a wrapper around MINPACK’s lmdif and lmder algorithms.
least_squares implements other methods in addition to the MINPACK algorithm.
method : {‘trf’, ‘dogbox’, ‘lm’}, optional
Algorithm to perform minimization.
‘trf’ : Trust Region Reflective algorithm, particularly suitable for large sparse problems with bounds. Generally robust method.
‘dogbox’ : dogleg algorithm with rectangular trust regions, typical use case is small problems with bounds. Not recommended for problems with rank-deficient Jacobian.
‘lm’ : Levenberg-Marquardt algorithm as implemented in MINPACK. Doesn’t handle bounds and sparse Jacobians. Usually the most efficient method for small unconstrained problems.
Default is ‘trf’. See Notes for more information.
It is possible for some problems that the ‘lm’ method converges while ‘trf’ does not (or vice versa), so the two functions will not generally behave the same unless you pass method='lm' to least_squares - which, however, rules out bounds.
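To make the difference concrete, here is a small sketch (the exponential-decay model and the noisy data are made up for illustration) showing how to match leastsq by passing method='lm', and how bounds force the ‘trf’ (or ‘dogbox’) path:

```python
import numpy as np
from scipy.optimize import leastsq, least_squares

# residuals of a simple exponential-decay model (illustrative data)
def residuals(p, t, y):
    return y - p[0] * np.exp(-p[1] * t)

t = np.linspace(0, 10, 50)
y = 2.5 * np.exp(-0.7 * t) + 0.05 * np.random.default_rng(0).normal(size=t.size)
p0 = [1.0, 1.0]

# old interface: always MINPACK Levenberg-Marquardt
p_old, _ = leastsq(residuals, p0, args=(t, y))

# new interface: default method is 'trf'; method='lm' reproduces leastsq (no bounds allowed)
res_lm = least_squares(residuals, p0, args=(t, y), method="lm")

# with bounds you must stay on 'trf' or 'dogbox'
res_trf = least_squares(residuals, p0, args=(t, y), bounds=([0, 0], [10, 5]))

print(p_old, res_lm.x, res_trf.x)
```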
Does anyone know how to feed an initial solution, or a matrix of initial solutions, into the differential evolution function from the SciPy library?
The documentation doesn't explain whether it's possible, but I know that accepting an initial solution is not unusual. SciPy is so widely used that I would expect it to have that type of functionality.
Ok, after review and testing I believe I now understand it.
Among the parameters that the scipy.optimize.differential_evolution(...) function accepts is init, which allows you to pass in an array of candidate solutions. Personally I was looking at a set of coordinates, so I enumerated them into an array, generated 99 other variations of it (100 different solutions), and fed this matrix into the init parameter. I believe it needs to contain more than 4 solutions or you are going to get a tuple error.
I probably didn't need to ask/answer the question, though it may help others who got equally confused.
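As a concrete sketch (the Rosenbrock objective and the 20-member seed population are just for illustration):

```python
import numpy as np
from scipy.optimize import differential_evolution, rosen

bounds = [(-2, 2), (-2, 2)]
rng = np.random.default_rng(0)

# seed the population: one initial guess plus random perturbations of it
guess = np.array([0.5, 0.5])
init_pop = guess + 0.1 * rng.normal(size=(20, 2))   # 20 candidate solutions, 2 parameters
init_pop = np.clip(init_pop, -2, 2)                 # keep the seeds inside the bounds

result = differential_evolution(rosen, bounds, init=init_pop)
print(result.x, result.fun)
```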
If I have a vector space spanned by five vectors v1, ..., v5, and I want to find an orthogonal basis for A, where A = [v1, v2, ..., v5] and A is 5 x n,
should I use np.linalg.qr(A) or scipy.linalg.orth(A)?
Thanks in advance
Note that sp.linalg.orth uses the SVD while np.linalg.qr uses a QR factorization. Both factorizations are obtained via wrappers for LAPACK functions.
I don't think there is a strong preference for one over the other. The SVD will be slightly more stable but also a bit slower to compute. In practice I don't think you will really see much of a difference.
You'll want to use:
scipy.linalg.orth(A)
The generally accepted rule is to use scipy.linalg - because it covers more functionality than np.linalg. Hope that helps!
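A quick sketch comparing the two on a random matrix (the 7 x 5 size is arbitrary; the columns play the role of v1, ..., v5):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
A = rng.normal(size=(7, 5))   # columns are the spanning vectors

# SVD-based orthonormal basis for the column space (handles rank deficiency)
Q_svd = linalg.orth(A)

# QR-based alternative (assumes A has full column rank)
Q_qr, _ = np.linalg.qr(A)

# both produce orthonormal columns spanning the same subspace
print(np.allclose(Q_svd.T @ Q_svd, np.eye(Q_svd.shape[1])))
print(np.allclose(Q_qr.T @ Q_qr, np.eye(Q_qr.shape[1])))
```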
Usually I use Mathematica, but now I'm trying to shift to Python, so this question might be a trivial one; I'm sorry about that.
Anyway, is there any built-in function in Python which is similar to the function named Interval[{min,max}] in Mathematica? The link is: http://reference.wolfram.com/language/ref/Interval.html
What I am trying to do is: I have a function and I am trying to minimize it, but it is a constrained minimization; by that I mean the parameters of the function are only allowed within some particular interval.
For a very simple example, let's say f(x) is a function with parameter x and I am looking for the value of x which minimizes the function, but x is constrained to an interval (min, max). [Obviously the actual problem is not one-dimensional but a multi-dimensional optimization, so different parameters may have different intervals.]
Since it is an optimization problem, of course I do not want to pick the parameter randomly from an interval.
Any help will be highly appreciated, thanks!
If it's a highly non-linear problem, you'll need to use an algorithm such as the Generalized Reduced Gradient (GRG) Method.
The idea of the generalized reduced gradient algorithm (GRG) is to solve a sequence of subproblems, each of which uses a linear approximation of the constraints. (Ref)
You'll need to ensure that certain conditions known as the KKT conditions are met, etc. but for most continuous problems with reasonable constraints, you'll be able to apply this algorithm.
This is a good reference for such problems with a few examples provided. Ref. pg. 104.
Regarding implementation:
While I am not familiar with Python, I have built solver libraries in C++ using templates as well as function pointers, so you can pass functions (for the objective as well as the constraints) as arguments to the solver and get your result - hopefully in polynomial time for convex problems or when the initial values are reasonable.
If an ability to do that exists in Python, it shouldn't be difficult to build a generalized GRG solver.
The Python Solution:
Edit: Here is the Python solution to your problem: Python constrained non-linear optimization
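If all you need are per-parameter intervals (rather than general nonlinear constraints), a common approach - not a GRG implementation - is scipy.optimize.minimize with a bounds argument; the objective and intervals below are made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# example objective: a 2-D Rosenbrock-style function
def f(x):
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

# each parameter constrained to its own interval, analogous to Interval[{min, max}]
bounds = [(-0.5, 0.5), (0.0, 2.0)]

x0 = np.array([0.0, 1.0])   # starting point inside the box
result = minimize(f, x0, method="L-BFGS-B", bounds=bounds)
print(result.x, result.fun)
```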
I'm trying to perform a constrained least-squares estimation using Scipy such that all of the coefficients are in the range (0,1) and sum to 1 (this functionality is implemented in Matlab's LSQLIN function).
Does anybody have tips for setting up this calculation using Python/SciPy? I believe I should be using scipy.optimize.fmin_slsqp(), but am not entirely sure what parameters I should be passing to it.[1]
Many thanks for the help,
Nick
[1] The one example in the documentation for fmin_slsqp is a bit difficult for me to parse without the referenced text -- and I'm new to using Scipy.
scipy-optimize-leastsq-with-bound-constraints on SO gives leastsq_bounds, which is leastsq with bound constraints such as 0 <= x_i <= 1.
The constraint that they sum to 1 can be added in the same way.
(I've found leastsq_bounds / MINPACK to be good on synthetic test functions in 5d, 10d, 20d; how many variables do you have?)
Have a look at this tutorial; it seems pretty clear.
Since MATLAB's lsqlin is a bounded linear least squares solver, you would want to check out scipy.optimize.lsq_linear.
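A minimal sketch of lsq_linear with box constraints (A and b are random stand-ins); note that this handles the [0, 1] bounds but not the sum-to-one equality, which needs one of the other approaches in this thread:

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)

# lsqlin-style box constraints: every coefficient in [0, 1]
res = lsq_linear(A, b, bounds=(0, 1))
print(res.x)
```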
Non-negative least squares optimization using scipy.optimize.nnls is a robust way of doing it. Note that if the coefficients are constrained to be positive and to sum to unity, they are automatically limited to the interval [0,1]; that is, one need not additionally constrain them from above.
scipy.optimize.nnls makes the variables non-negative using the Lawson and Hanson algorithm, whereas the sum constraint can be taken care of as discussed in this thread and this one.
SciPy's nnls uses an old Fortran backend, which is apparently widely used in equivalent nnls implementations in other software.
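Pulling these answers together, one way to enforce both the [0, 1] bounds and the sum-to-one constraint is scipy.optimize.minimize with the SLSQP method (the same solver behind fmin_slsqp); A and b below are random placeholders:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 4))
b = rng.normal(size=20)

def objective(x):
    return np.sum((A @ x - b) ** 2)   # ordinary least-squares objective

n = A.shape[1]
x0 = np.full(n, 1.0 / n)                                       # start at the uniform point
bounds = [(0, 1)] * n                                          # 0 <= x_i <= 1
constraints = {"type": "eq", "fun": lambda x: np.sum(x) - 1}   # coefficients sum to 1

res = minimize(objective, x0, method="SLSQP", bounds=bounds, constraints=constraints)
print(res.x, res.x.sum())
```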