Translating mathematical functions from MATLAB to Python

I am currently working on a project that involves translating a MATLAB program to Python to increase speed and efficiency. However, I have hit a stumbling block. First, I am confused as to what the tilde (~) indicates in MATLAB, and how to represent it in a corresponding way in Python. Second, I have been searching through the documentation and am having a difficult time finding an equivalent for MATLAB's 'sign' function.
indi = ~abs(indexd);
wav = (sum(sum(wv)))/(length(wv)*(length(wv)-1));
thetau = (sign(sign(wv - wav) - 0.1) + 1)/2;
thetad = (sign(sign(wav - wv) - 0.1) + 1)/2;
I have already converted indexd and wv, which come from a previous section of code, to numpy arrays. What is the most efficient, Pythonic way to replace the ~ and sign functions?

If you're using numpy, ~ inverts things much as in MATLAB, with one caveat: on boolean arrays it is a logical NOT, but on integer arrays it is a bitwise NOT. See: What does the unary operator ~ do in numpy?. Since abs(indexd) is numeric, MATLAB's ~abs(indexd) (true exactly where indexd is zero) is better written as np.abs(indexd) == 0 or np.logical_not(np.abs(indexd)). The sign function also exists in numpy as numpy.sign. Therefore, the code above becomes:
>>> import numpy as np
>>> indi = np.abs(indexd) == 0  # MATLAB's ~abs(indexd); plain ~ is only a logical NOT on boolean arrays
>>> wav = (np.sum(wv))/(len(wv)*(len(wv)-1))
>>> thetau = (np.sign(np.sign(wv - wav) - 0.1) + 1)/2
>>> thetad = (np.sign(np.sign(wav - wv) - 0.1) + 1)/2
Be advised that MATLAB's length on matrices returns the largest dimension, whereas Python's len gives the number of rows of a 2-D numpy array. Assuming the number of rows in wv is greater than or equal to the number of columns, the above code will work as you expect. However, if you have more columns than rows, you'd need to find the maximum of the dimensions and use that instead... so:
>>> import numpy as np
>>> maxdim = np.max(wv.shape)
>>> indi = np.abs(indexd) == 0  # as above; plain ~ is only safe on boolean arrays
>>> wav = (np.sum(wv))/(maxdim*(maxdim-1))
>>> thetau = (np.sign(np.sign(wv - wav) - 0.1) + 1)/2
>>> thetad = (np.sign(np.sign(wav - wv) - 0.1) + 1)/2
The above call to numpy.sum sums over all dimensions by default, so there's no need for nested sum calls to sum over the entire matrix (thanks, Divakar!).
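To see the ~ caveat and np.sign in action, here's a quick sketch with a made-up array standing in for indexd:
>>> import numpy as np
>>> idx = np.array([-2, 0, 3])  # stand-in for indexd
>>> ~np.abs(idx)  # bitwise NOT on an integer array: NOT MATLAB's ~
array([-3, -1, -4])
>>> (np.abs(idx) == 0).astype(int)  # MATLAB's ~abs(indexd) as 0/1 values
array([0, 1, 0])
>>> np.sign(np.array([-5.0, 0.0, 2.5]))  # behaves like MATLAB's sign
array([-1.,  0.,  1.])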
I totally recommend the awesome table and guide for translating from MATLAB to numpy: Link

Related

Right matrix division in Scipy/NumPy? [duplicate]

I have this line of MATLAB code:
a/b
I am using these inputs:
a = [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
b = ones(25, 18)
This is the result (a 1x25 matrix):
[5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
What is MATLAB doing? I am trying to duplicate this behavior in Python, and the mrdivide documentation in MATLAB was unhelpful. Where does the 5 come from, and why are the rest of the values 0?
I have tried this with other inputs and receive similar results, usually just a different first element and zeros filling the remainder of the matrix. In Python when I use linalg.lstsq(b.T,a.T), all of the values in the first matrix returned (i.e. not the singular one) are 0.2. I have already tried right division in Python and it gives something completely off with the wrong dimensions.
I understand what a least square approximation is, I just need to know what mrdivide is doing.
Related:
Array division- translating from MATLAB to Python
MRDIVIDE or the / operator actually solves the xb = a linear system, as opposed to MLDIVIDE or the \ operator which will solve the system bx = a.
To solve a system xb = a with a non-symmetric, non-invertible matrix b, you can rely either on mrdivide(), which works via factorization of b with Gaussian elimination, or on pinv(), which works via singular value decomposition, zeroing the singular values below a (default) tolerance level.
Here is the difference (for the case of mldivide): What is the difference between PINV and MLDIVIDE when I solve A*x=b?
When the system is overdetermined, both algorithms provide the same answer. When the system is underdetermined, PINV will return the solution x that has the minimum norm (min NORM(x)). MLDIVIDE will pick the solution with the least number of non-zero elements.
In your example:
% solve xb = a
a = [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9];
b = ones(25, 18);
the system is underdetermined, and the two different solutions will be:
x1 = a/b; % MRDIVIDE: sparsest solution (min L0 norm)
x2 = a*pinv(b); % PINV: minimum norm solution (min L2)
>> x1 = a/b
Warning: Rank deficient, rank = 1, tol = 2.3551e-014.
x1 =
5.0000 0 0 ... 0
>> x2 = a*pinv(b)
x2 =
0.2 0.2 0.2 ... 0.2
In both cases the approximation error of xb-a is non-negligible (non-exact solution) and the same, i.e. norm(x1*b-a) and norm(x2*b-a) will return the same result.
What is MATLAB doing?
A great break-down of the algorithms (and checks on properties) invoked by the '\' operator, depending upon the structure of matrix b, is given in this post on scicomp.stackexchange.com. I am assuming similar options apply for the / operator.
For your example, MATLAB is most probably doing Gaussian elimination, giving the sparsest solution amongst an infinitude (that's where the 5 comes from).
What is Python doing?
Python's linalg.lstsq uses the pseudo-inverse/SVD, as demonstrated above (that's why you get a vector of 0.2's). In effect, both of the following give you the same result as MATLAB's pinv():
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9])
b = np.ones((25, 18))
# xb = a: solve b.T x.T = a.T instead; [0] takes the solution from lstsq's tuple
x2 = np.linalg.lstsq(b.T, a.T, rcond=None)[0]
x2 = a @ np.linalg.pinv(b)
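As a quick sanity check of the residual claim above, continuing the same snippet:
print(x2[:3])                      # [0.2 0.2 0.2], the minimum-norm solution
print(np.linalg.norm(x2 @ b - a))  # same residual as MATLAB's x1 = a/b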
TL;DR: A/B = np.linalg.solve(B.conj().T, A.conj().T).conj().T
I did not find the earlier answers to be a satisfactory substitute, so I dug further into MATLAB's reference documentation for mrdivide and found the solution. I cannot explain the actual mathematics here or take credit for coming up with the answer; I'm just following MATLAB's explanation. Additionally, I wanted to post the actual detail from MATLAB to give credit. If it's a copyright issue, someone tell me and I'll remove the quoted text.
%/ Slash or right matrix divide.
% A/B is the matrix division of B into A, which is roughly the
% same as A*INV(B) , except it is computed in a different way.
% More precisely, A/B = (B'\A')'. See MLDIVIDE for details.
%
% C = MRDIVIDE(A,B) is called for the syntax 'A / B' when A or B is an
% object.
%
% See also MLDIVIDE, RDIVIDE, LDIVIDE.
% Copyright 1984-2005 The MathWorks, Inc.
Note that the ' symbol indicates the complex conjugate transpose. In python using numpy, that requires .conj().T chained together.
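Putting that identity into a small helper (a sketch; it assumes B is square and nonsingular, since np.linalg.solve requires that, and for rectangular B you'd fall back to lstsq):

import numpy as np

def mrdivide(A, B):
    # A/B = (B'\A')' per the MATLAB doc above; ' is .conj().T in numpy
    return np.linalg.solve(B.conj().T, A.conj().T).conj().T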
Per this handy "cheat sheet" of numpy for MATLAB users, the equivalent is linalg.lstsq(b, a) -- here linalg is numpy.linalg, a light-weight version of the full scipy.linalg.
a/b finds the least-squares solution to the system of linear equations x*b = a.
If b is invertible, this is a*inv(b); if it isn't, it is the x which minimises norm(x*b - a).
You can read more about least squares on Wikipedia.
According to the MATLAB documentation, mrdivide will return at most k non-zero values, where k is the computed rank of b. My guess is that MATLAB in your case solves the least-squares problem obtained by replacing b with b(1,:) (which has the same rank). In this case the Moore-Penrose inverse b2 = b(1,:); inv(b2*b2')*b2*a' is defined and gives the same answer.
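That guess is easy to check numerically (a quick sketch of the claim above):

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
b = np.ones((25, 18))
b2 = b[0:1, :]  # first row of b; rank 1, same as b
x = np.linalg.inv(b2 @ b2.T) @ b2 @ a  # inv([[18]]) times the row-sum of a
print(x)  # [5.], the lone non-zero entry MATLAB's a/b returns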

MATLAB matrix power algorithm

I'm looking to port an algorithm from MATLAB to Python. One step in said algorithm involves taking A^(-1/2), where A is a 9x9 square complex matrix. As I understand it, square roots of matrices (and by extension their inverses) are not unique.
I've been experimenting with scipy.linalg.fractional_matrix_power and an approximation using A^(-1/2) = exp((-1/2)*log(A)) with scipy's expm and logm functions. The former is exceptionally poor and only provides 3 decimal places of precision, whereas the latter is decently correct for elements in the top-left corner but gets progressively worse as you move down and to the right. This may or may not be a perfectly valid mathematical solution to the expression; however, it doesn't suffice for this application.
As a result, I'm looking to directly implement MATLAB's matrix power algorithm in Python so that I can 100% confirm the same result each time. Does anyone have any insight or documentation on how this would work? The more parallelizable this algorithm is, the better, as eventually the goal would be to rewrite it in OpenCL for GPU acceleration.
EDIT: An MCVE as requested:
[[(0.591557294607941+4.33680868994202e-19j), (-0.219707725574605-0.35810724986609j), (-0.121305654177909+0.244558388829046j), (0.155552026648172-0.0180264818714123j), (-0.0537690384136066-0.0630740244116577j), (-0.0107526931263697+0.0397896274845627j), (0.0182892503609312-0.00653264433724856j), (-0.00710188853532244-0.0050445035279044j), (-2.20414002823034e-05+0.00373184532662288j)], [(-0.219707725574605+0.35810724986609j), (0.312038814492119+2.16840434497101e-19j), (-0.109433401402399-0.174379997015402j), (-0.0503362231078033+0.108510948023091j), (0.0631826956936223-0.00992931123813742j), (-0.0219902325360141-0.0233215237172002j), (-0.00314837555001163+0.0148621558916679j), (0.00630295247506065-0.00266790359447072j), (-0.00249343102520442-0.00156160619280611j)], [(-0.121305654177909-0.244558388829046j), (-0.109433401402399+0.174379997015402j), (0.136649392858215-1.76182853028894e-19j), (-0.0434623984527311-0.0669251299161109j), (-0.0168737559719828+0.0393768358149159j), (0.0211288536117387-0.00417146769324491j), (-0.00734306979471257-0.00712443264825166j), (-0.000742681625102133+0.00455752452374196j), (0.00179068247786595-0.000862706240042082j)], [(0.155552026648172+0.0180264818714123j), (-0.0503362231078033-0.108510948023091j), (-0.0434623984527311+0.0669251299161109j), (0.0467980890488569+5.14996031930615e-19j), (-0.0140208255975664-0.0209483313237692j), (-0.00472995448413803+0.0117916398375124j), (0.00589653974090387-0.00134198920550751j), (-0.00202109265416585-0.00184021636458858j), (-0.000150793859056431+0.00116822322464066j)], [(-0.0537690384136066+0.0630740244116577j), (0.0631826956936223+0.00992931123813742j), (-0.0168737559719828-0.0393768358149159j), (-0.0140208255975664+0.0209483313237692j), (0.0136137125669776-2.03287907341032e-20j), (-0.00387854073283377-0.0056769786724813j), (-0.0011741038702424+0.00306007798625676j), (0.00144000687517355-0.000355251914809693j), (-0.000481433965262789-0.00042129815655098j)], [(-0.0107526931263697-0.0397896274845627j), (-0.0219902325360141+0.0233215237172002j), (0.0211288536117387+0.00417146769324491j), (-0.00472995448413803-0.0117916398375124j), (-0.00387854073283377+0.0056769786724813j), (0.00347771689075251+8.21621958836671e-20j), (-0.000944046302699304-0.00136521328407881j), (-0.00026318475762475+0.000704212317211994j), (0.00031422288569727-8.10033316327328e-05j)], [(0.0182892503609312+0.00653264433724856j), (-0.00314837555001163-0.0148621558916679j), (-0.00734306979471257+0.00712443264825166j), (0.00589653974090387+0.00134198920550751j), (-0.0011741038702424-0.00306007798625676j), (-0.000944046302699304+0.00136521328407881j), (0.000792908166233942-7.41153828847513e-21j), (-0.00020531962049495-0.000294952695922854j), (-5.36226164765808e-05+0.000145645628243286j)], [(-0.00710188853532244+0.00504450352790439j), (0.00630295247506065+0.00266790359447072j), (-0.000742681625102133-0.00455752452374196j), (-0.00202109265416585+0.00184021636458858j), (0.00144000687517355+0.000355251914809693j), (-0.00026318475762475-0.000704212317211994j), (-0.00020531962049495+0.000294952695922854j), (0.000162971629601464-5.39321759384574e-22j), (-4.03304806590714e-05-5.77159110863666e-05j)], [(-2.20414002823034e-05-0.00373184532662288j), (-0.00249343102520442+0.00156160619280611j), (0.00179068247786595+0.000862706240042082j), (-0.000150793859056431-0.00116822322464066j), (-0.000481433965262789+0.00042129815655098j), (0.00031422288569727+8.10033316327328e-05j), (-5.36226164765808e-05-0.000145645628243286j), (-4.03304806590714e-05+5.77159110863666e-05j), 
(3.04302590501313e-05-4.10281583826302e-22j)]]
I can think of two explanations; in both cases, I accuse user error. In chronological order:
Theory #1 (the subtle one)
My suspicion is that you're copying the printed values of the input matrix from one code as input into the other. I.e. you're throwing away double precision when you switch codes, which gets amplified during the inverse-square-root calculation.
As proof, I compared MATLAB's inverse square root with the very function you're using in python. I will show a 3x3 example due to size considerations, but—spoiler warning—I did the same with a 9x9 random matrix and got two results with condition number 11.245754109790719 (MATLAB) and 11.245754109790818 (numpy). That should tell you something about the similarity of the results without having to save and load the actual matrices between the two codes. I suggest you do this though: keywords are scipy.io.loadmat and savemat.
What I did was generate the random data in python (because that's what I prefer):
>>> import numpy as np
>>> print((np.random.rand(3,3) + 1j*np.random.rand(3,3)).tolist())
[[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)], [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)], [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]]
By copying the same truncated output into both codes, I guarantee the correspondence of the inputs.
Example in MATLAB:
>> M = [[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)]; [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)]; [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]];
>> A = M^(-0.5);
>> format long
>> disp(A)
0.922112307438377 + 0.919346397931976i 0.108620882045523 - 0.649850434897895i -0.778737740194425 - 0.320654127149988i
-0.423384022626231 - 0.842737730824859i 0.592015668030645 + 0.661682656423866i 0.529361991464903 - 0.388343838121371i
-0.550789874427422 + 0.021129515921025i 0.472026152514446 - 0.502143106675176i 0.942976466768961 + 0.141839849623673i
>> cond(A)
ans =
3.429368520364765
Example in python:
>>> M = [[(0.8404782758300281+0.29389006737780765j), (0.741574080512219+0.7944606900644321j), (0.12788250870304718+0.37304665786925073j)], [(0.8583402784463595+0.13952117266781894j), (0.2138809231406249+0.6233427148017449j), (0.7276466404131303+0.6480559739625379j)], [(0.1784816129006297+0.72452362541158j), (0.2870462766764591+0.8891190037142521j), (0.0980355896905617+0.03022344706473823j)]]
>>> from scipy.linalg import fractional_matrix_power
>>> A = fractional_matrix_power(M, -0.5)
>>> print(A)
[[ 0.92211231+0.9193464j 0.10862088-0.64985043j -0.77873774-0.32065413j]
[-0.42338402-0.84273773j 0.59201567+0.66168266j 0.52936199-0.38834384j]
[-0.55078987+0.02112952j 0.47202615-0.50214311j 0.94297647+0.14183985j]]
>>> np.linalg.cond(A)
3.4293685203647408
My suspicion is that if you scipy.io.loadmat the matrix into python, do the calculation, scipy.io.savemat the result and load it back in with MATLAB, you'll see less than 1e-12 absolute error (hopefully even less) between the results.
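A minimal round-trip sketch of that suggestion (the file names and the variable name 'M' are made up for illustration):

from scipy.io import loadmat, savemat
from scipy.linalg import fractional_matrix_power

M = loadmat('matrix_from_matlab.mat')['M']  # the exact doubles MATLAB saved
A = fractional_matrix_power(M, -0.5)
savemat('result_from_python.mat', {'A': A})  # load this back in MATLAB and compare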
Theory #2 (the facepalm one)
My suspicion is that you're using python 2, where / between integers is integer division, so your -1/2 power is actually a plain inverse:
>>> # python 3 below
>>> # python 3's // is python 2's /, i.e. integer division
>>> 1/2
0.5
>>> 1//2
0
>>> -1/2
-0.5
>>> -1//2
-1
So if you're using python 2, then calling
fractional_matrix_power(M,-1/2)
is actually the inverse of M. The obvious solution is to switch to python 3. The less obvious solution is to keep using python 2 (which you shouldn't, as the above exemplifies), but use
from __future__ import division
at the top of every one of your source files. This will override the behaviour of the plain / division operator so that it matches the python 3 version, and you will have one less headache.
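For instance, a quick python 2 session sketch with the future import in place:
>>> from __future__ import division
>>> -1/2  # now true division, as in python 3
-0.5
>>> -1//2  # floor division is still available when you want it
-1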

Getting a better answer from sympy inverse laplace transform

Trying to compute the following lines, I'm getting a really complex result.
from sympy import *
s = symbols("s")
t = symbols("t")
h = 1/(s**3 + s**2/5 + s)
inverse_laplace_transform(h,s,t)
The result is the following:
(-(I*exp(-t/10)*sin(3*sqrt(11)*t/10) - exp(-t/10)*cos(3*sqrt(11)*t/10))*gamma(-3*sqrt(11)*I/5)*gamma(-1/10 - 3*sqrt(11)*I/10)/(gamma(9/10 - 3*sqrt(11)*I/10)*gamma(1 - 3*sqrt(11)*I/5)) + (I*exp(-t/10)*sin(3*sqrt(11)*t/10) + exp(-t/10)*cos(3*sqrt(11)*t/10))*gamma(3*sqrt(11)*I/5)*gamma(-1/10 + 3*sqrt(11)*I/10)/(gamma(9/10 + 3*sqrt(11)*I/10)*gamma(1 + 3*sqrt(11)*I/5)) + gamma(1/10 - 3*sqrt(11)*I/10)*gamma(1/10 + 3*sqrt(11)*I/10)/(gamma(11/10 - 3*sqrt(11)*I/10)*gamma(11/10 + 3*sqrt(11)*I/10)))*Heaviside(t)
However, the answer should be simpler; WolframAlpha shows it.
Is there any way to simplify this result?
I played with this one a bit, and the simplest result I could find uses something like:
from sympy import *
s = symbols("s")
t = symbols("t", positive=True)
h = 1/(s**3 + s**2/5 + s)
inverse_laplace_transform(h,s,t).evalf().simplify()
Notice that I define t as a positive variable; otherwise the sympy function returns a large term followed by the Heaviside function. The result still contains many gamma functions that I could not reduce to the expression returned by Wolfram. Using evalf(), some of those are converted to their numeric values, and after simplification you get an expression similar to the one from Wolfram but with floating-point numbers.
Unfortunately this part of Sympy is not quite mature. I also tried with Maxima and the result is quite close to the one in Wolfram. So it seems that Wolfram is not doing anything really special there.
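One more thing worth trying (a sketch; how well it works depends on your SymPy version): decompose h into partial fractions over its complex roots before transforming, since SymPy handles simple 1/(s - a) terms cleanly:

from sympy import symbols, inverse_laplace_transform, apart, simplify

s = symbols("s")
t = symbols("t", positive=True)
h = 1/(s**3 + s**2/5 + s)

h_pf = apart(h, s, full=True).doit()  # full=True splits over the complex roots
result = simplify(inverse_laplace_transform(h_pf, s, t))
print(result)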

plus/minus operator for python ±

I am looking for a way to do a plus/minus operation in python 2 or 3. I do not know of a command or operator for this, and I cannot find one.
Am I missing something?
If you are looking to print the ± symbol, just use:
print(u"\u00B1")
Another possibility: uncertainties is a module for doing calculations with error tolerances, i.e.
(2.1 +/- 0.05) + (0.6 +/- 0.05) # => (2.7 +/- 0.1)
which would be written as
from uncertainties import ufloat
ufloat(2.1, 0.05) + ufloat(0.6, 0.05)
Edit: I was getting some odd results, and after a bit more playing with this I figured out why: the specified error is not a tolerance (hard additive limits, as in engineering blueprints) but a standard-deviation value, which is why the above calculation results in
ufloat(2.7, 0.07071) # not 0.1 as I expected!
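A quick check of that standard-deviation behaviour (assuming the uncertainties package is installed):

from uncertainties import ufloat

x = ufloat(2.1, 0.05) + ufloat(0.6, 0.05)
print(x)  # 2.70+/-0.07: independent errors combine in quadrature, not additively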
If you happen to be using matplotlib, you can render mathematical expressions inside figure text much as you would with LaTeX. For the +/- symbol, you would use:
plt.xlabel(r"value $\pm$ error")
where the r marks a raw string and the $ signs delimit the part of the string that is a mathematical expression. Any words in that part will be set in a different font and will have no whitespace between them unless explicitly noted with the correct code. This can be found on the relevant page of the matplotlib documentation.
Sorry if this is too niche, but I stumbled across this question trying to find this very answer.
Instead of computing expressions like
s1 = sqrt((125 + 10 * sqrt(19)) / 366)
s2 = sqrt((125 - 10 * sqrt(19)) / 366)
you could use
import numpy as np
pm = np.array([+1, -1])
s1, s2 = np.sqrt((125 + pm * 10 * np.sqrt(19)) / 366)
I think you want this for an equation containing a ± sign. There is no operator for that unless you use SymPy; otherwise all you can do is branch (e.g. with an if statement) and evaluate each sign separately.
There is no such object in SymPy yet (as you saw, there is an issue suggesting one: https://github.com/sympy/sympy/issues/5305). It's not hard to emulate, though: just create a Symbol and swap it out with +1 and -1 separately at the end. Like
from sympy import Symbol, symbols
x = symbols('x')
pm = Symbol(u'±')  # the u is not needed in Python 3; ± is used just for pretty printing and has no special meaning
expr = 1 + pm*x  # or whatever
# do some stuff
exprpos = expr.subs(pm, 1)
exprneg = expr.subs(pm, -1)
You could also just keep track of two equations from the start.
Instead of computing expressions like
s1 = sqrt((125.0 + 10.0*sqrt(19)) / 366.0)
s2 = sqrt((125.0 - 10.0*sqrt(19)) / 366.0)
you could use
from math import sqrt
r = 10.0 * sqrt(19)
s1, s2 = (sqrt((125.0 + i) / 366.0) for i in (r, -r))
This is based on Nico's answer, but using a generator expression instead of NumPy
A plus/minus tolerance test can be done by comparing an absolute difference against the tolerance you wish to test for. Something like (the numbers here are made-up examples):
tst_data = 5.02   # number you wish to test
norm = 5.0        # target number
tolerance = 0.05  # whatever the allowed tolerance is
if abs(tst_data - norm) <= tolerance:
    pass  # do stuff
Using the abs function lets a single test accept values within the tolerance on either side (+/-) of the target.

Operations for Long and Float in Python

I'm trying to compute this:
from scipy import *
3600**3400 * (exp(-3600)) / factorial(3400)
the error: unsupported long and float
Try using logarithms instead of working with the numbers directly. Since none of your operations are addition or subtraction, you could do the whole thing in logarithm form and convert back at the end.
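A minimal sketch of that approach using only the standard library (math.lgamma(n + 1) gives log(n!)):

from math import exp, lgamma, log

# log of 3600**3400 * exp(-3600) / factorial(3400)
log_val = 3400 * log(3600) - 3600 - lgamma(3401)
print(exp(log_val))  # roughly 2.379e-05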
Computing with numbers of such magnitude, you just can't use ordinary 64-bit-or-so floats, which is what Python's core runtime supports. Consider gmpy (do not get the sourceforge version, it's aeons out of date) -- with that, math, and some care...:
>>> e = gmpy.mpf(math.exp(1))
>>> gmpy.mpz(3600)**3400 * (e**(-3600)) / gmpy.fac(3400)
mpf('2.37929475533825366213e-5')
(I'm biased about gmpy, of course, since I originated and still participate in that project, but I'd never make strong claims about its floating point abilities... I've been using it mostly for integer stuff... still, it does make this computation possible!-).
You could try using the Decimal object. Calculations will be slower but you won't have trouble with really small numbers.
from decimal import Decimal
I don't know how Decimal interacts with the scipy module, however.
This numpy discussion might be relevant.
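For what it's worth, here's a sketch of the Decimal route using only the standard library, so scipy isn't needed at all:

from decimal import Decimal, getcontext
from math import factorial

getcontext().prec = 30  # plenty of working precision
result = (Decimal(3600) ** 3400
          * Decimal(-3600).exp()
          / Decimal(factorial(3400)))
print(result)  # roughly 2.379E-5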
Well, the error comes about because you are trying to multiply
3600**3400
which is a long, with
exp(-3600)
which is a float.
But regardless, the error you are receiving disguises the true problem: 3600**3400 is far too large to convert to a float, and exp(-3600) underflows to zero anyway. The python math library is fickle with large numbers, at best.
exp(-3600) is too small, factorial(3400) is too large:
In [1]: from scipy import exp
In [2]: exp(-3600)
Out[2]: 0.0
In [3]: from scipy import factorial
In [4]: factorial(3400)
Out[4]: array(1.#INF)
What about calculating it step by step as a workaround (it also makes sense to check the smallest and biggest intermediate results)? Here it is in Python 3; note that under Python 2, -3600/3400 would have been integer division:
from math import exp

output = 1.0
smallest = 1e100
biggest = 0.0
# pair a small divisor i with a large divisor j on each pass; together the
# two ranges cover 1..3400 (all of factorial(3400)), and the exp factor is
# applied 3400 times in total, giving exp(-3600)
for i, j in zip(range(1, 1701), range(3400, 1700, -1)):
    output = output * 3600 * exp(-3600 / 3400) / i
    output = output * 3600 * exp(-3600 / 3400) / j
    smallest = min(smallest, output)
    biggest = max(biggest, output)
print("output: ", output)
print("smallest: ", smallest)
print("biggest: ", biggest)
output is:
output: 2.37929475534e-005
smallest: 2.37929475534e-005
biggest: 1.28724174494e+214
