The equivalence of Matlab sprand() in Python?

The equivalence of Matlab sprand() in Python? - python

I am trying to translate a Matlab code snippet into a Python one. However, I am not very sure how to correctly implement the sprand() function.
This is how the Matlab code use sprand():
% n_z is an integer, n_dw is a matrix
n_p_z_dw = cell(n_z, 1); % n(d,w) * p(z|d,w)
for z = 1:n_z
n_p_z_dw{z} = sprand(n_dw);
And this is how I implement the above logic in Python:
n_p_z_dw = [None]*n_z # n(d,w) * p(z|d,w)
density = np.count_nonzero(n_dw)/float(n_dw.size)
for i in range(0, n_z):
n_p_z_dw[i] = scipy.sparse.rand(n_d, n_w, density=density)
It seems to work, but I am not very sure about this. Any comment or suggestion?

The following should be a relatively fast way, I think, for a sparse array A:
import scipy.sparse as sparse
import numpy as np
sparse.coo_matrix((np.random.rand(A.nnz),A.nonzero()),shape=A.shape)
This will construct a COO format sparse matrix: it uses A.nonzero() as the coordinates, and A.nnz (the number of nonzero entries in A) to find the number of random numbers to generate.
I wonder, though, whether this might be a useful addition to the scipy.sparse.rand function.

Related

Convert 3D numpy Array in Python to 3D Matrix in Matlab [duplicate]

I am looking for a way to pass NumPy arrays to Matlab.
I've managed to do this by storing the array into an image using scipy.misc.imsave and then loading it using imread, but this of course causes the matrix to contain values between 0 and 256 instead of the 'real' values.
Taking the product of this matrix divided by 256, and the maximum value in the original NumPy array gives me the correct matrix, but I feel that this is a bit tedious.
is there a simpler way?

Sure, just use scipy.io.savemat
As an example:
import numpy as np
import scipy.io
x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)
scipy.io.savemat('test.mat', dict(x=x, y=y))
Similarly, there's scipy.io.loadmat.
You then load this in matlab with load test.
Alteratively, as #JAB suggested, you could just save things to an ascii tab delimited file (e.g. numpy.savetxt). However, you'll be limited to 2 dimensions if you go this route. On the other hand, ascii is the universial exchange format. Pretty much anything will handle a delimited text file.

A simple solution, without passing data by file or external libs.
Numpy has a method to transform ndarrays to list and matlab data types can be defined from lists. So, when can transform like:
np_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mat_a = matlab.double(np_a.tolist())
From matlab to python requires more attention. There is no built-in function to convert the type directly to lists. But we can access the raw data, which isn't shaped, but plain. So, we use reshape (to format correctly) and transpose (because of the different way MATLAB and numpy store data). That's really important to stress: Test it in your project, mainly if you are using matrices with more than 2 dimensions. It works for MATLAB 2015a and 2 dims.
np_a = np.array(mat_a._data.tolist())
np_a = np_a.reshape(mat_a.size).transpose()

Here's a solution that avoids iterating in python, or using file IO - at the expense of relying on (ugly) matlab internals:
import matlab
# This is actually `matlab._internal`, but matlab/__init__.py
# mangles the path making it appear as `_internal`.
# Importing it under a different name would be a bad idea.
from _internal.mlarray_utils import _get_strides, _get_mlsize
def _wrapper__init__(self, arr):
assert arr.dtype == type(self)._numpy_type
self._python_type = type(arr.dtype.type().item())
self._is_complex = np.issubdtype(arr.dtype, np.complexfloating)
self._size = _get_mlsize(arr.shape)
self._strides = _get_strides(self._size)[:-1]
self._start = 0
if self._is_complex:
self._real = arr.real.ravel(order='F')
self._imag = arr.imag.ravel(order='F')
else:
self._data = arr.ravel(order='F')
_wrappers = {}
def _define_wrapper(matlab_type, numpy_type):
t = type(matlab_type.__name__, (matlab_type,), dict(
__init__=_wrapper__init__,
_numpy_type=numpy_type
))
# this tricks matlab into accepting our new type
t.__module__ = matlab_type.__module__
_wrappers[numpy_type] = t
_define_wrapper(matlab.double, np.double)
_define_wrapper(matlab.single, np.single)
_define_wrapper(matlab.uint8, np.uint8)
_define_wrapper(matlab.int8, np.int8)
_define_wrapper(matlab.uint16, np.uint16)
_define_wrapper(matlab.int16, np.int16)
_define_wrapper(matlab.uint32, np.uint32)
_define_wrapper(matlab.int32, np.int32)
_define_wrapper(matlab.uint64, np.uint64)
_define_wrapper(matlab.int64, np.int64)
_define_wrapper(matlab.logical, np.bool_)
def as_matlab(arr):
try:
cls = _wrappers[arr.dtype.type]
except KeyError:
raise TypeError("Unsupported data type")
return cls(arr)
The observations necessary to get here were:
Matlab seems to only look at type(x).__name__ and type(x).__module__ to determine if it understands the type
It seems that any indexable object can be placed in the ._data attribute
Unfortunately, matlab is not using the _data attribute efficiently internally, and is iterating over it one item at a time rather than using the python memoryview protocol :(. So the speed gain is marginal with this approach.

scipy.io.savemat or scipy.io.loadmat does NOT work for matlab arrays --v7.3. But the good part is that matlab --v7.3 files are hdf5 datasets. So they can be read using a number of tools, including numpy.
For python, you will need the h5py extension, which requires HDF5 on your system.
import numpy as np, h5py
f = h5py.File('somefile.mat','r')
data = f.get('data/variable1')
data = np.array(data) # For converting to numpy array

Some time ago I faced the same problem and wrote the following scripts to allow easy copy and pasting of arrays back and forth from interactive sessions. Obviously only practical for small arrays, but I found it more convenient than saving/loading through a file every time:
Matlab -> Python
Python -> Matlab

Not sure if it counts as "simpler" but I found a solution to move data from a numpy arrray created in a python script which is called by matlab quite fast:
dump_reader.py (python source):
import numpy
def matlab_test2():
np_a = numpy.random.uniform(low = 0.0, high = 30000.0, size = (1000,1000))
return np_a
dump_read.m (matlab script):
clear classes
mod = py.importlib.import_module('dump_reader');
py.importlib.reload(mod);
if count(py.sys.path,'') == 0
insert(py.sys.path,int32(0),'');
end
tic
A = py.dump_reader.matlab_test2();
toc
shape = cellfun(#int64,cell(A.shape));
ls = py.array.array('d',A.flatten('F').tolist());
p = double(ls);
toc
C = reshape(p,shape);
toc
It relies on the fact that matlabs double seems be working efficiently on arrays compared to cells/matrices. Second trick is to pass the data to matlabs double in an efficient way (via pythons native array.array).
P.S. sorry for necroposting but I struggled a lot with its and this topic was one of the closest hits. Maybe it helps someone to shorten the time of struggling.
P.P.S. tested with Matlab R2016b + python 3.5.4 (64bit)

The python library Darr allows you to save your Python numpy arrays in a self-documenting and widely readable format, consisting of just binary and text files. When saving your array, it will include code to read that array in a variety of languages, including Matlab. So in essence, it is just one line to save your numpy array to disk in Python, and then copy-paste the code from the README.txt to load it into Matlab.
Disclosure: I wrote the library.

From MATLAB R2022a on, matlab.double (and matlab.int8, matlab.uint8, etc.) objects implement the buffer protocol. This means that you can pass them into NumPy array constructors. Construction in the opposite direction (which is the subject of the question here) is supported as well. That is, matlab objects can be constructed from objects that implement the buffer protocol. Thus, for instance, a matlab.double can be constructed from a NumPy double array.
UPDATE: Furthermore, from MATLAB R2022b on, objects that implement the buffer protocol (such as NumPy objects) can be passed directly into MATLAB functions that are called via Python. From the MATLAB Release Notes for R2022b, under the "External Language Interfaces" section:
import matlab.engine
import numpy
eng = matlab.engine.start_matlab()
buf = numpy.array([[1, 2, 3], [4, 5, 6]], dtype='uint16')
# Supported in R2022a and earlier: must initialize a matlab.uint16 from
# the numpy array and pass it to the function
array_as_matlab_uint16 = matlab.uint16(buf)
res = eng.sum(array_as_matlab_uint16, 1, 'native')
print(res)
# Supported as of R2022b: can pass the numpy array
# directly to the function
res = eng.sum(buf, 1, 'native')
print(res)

Let use say you have a 2D daily data with shape (365,10) for five years saved in np array np3Darrat that will have a shape (5,365,10). In python save your np array:
import scipy.io as sio #SciPy module to load and save mat-files
m['np3Darray']=np3Darray #shape(5,365,10)
sio.savemat('file.mat',m) #Save np 3D array
Then in MATLAB convert np 3D array to MATLAB 3D matix:
load('file.mat','np3Darray')
M3D=permute(np3Darray, [2 3 1]); %Convert numpy array with shape (5,365,10) to MATLAB matrix with shape (365,10,5)

In latest R2021a, you can pass a python numpy ndarray to double() and it will convert to a native matlab matrix, even when calling in console the numpy array it will suggest at the bottom "Use double function to convert to a MATLAB array"

Result of 3D FFT using pyculib is wrong

I use pyculib to perform 3D FFT on a matrix in Anaconda 3.5. I just followed the example code posted in the website. But I found something interesting and don't understand why.
Performing a 3D FFT on matrix with pyculib is correct only when using numpy.arange to create the matrix.
Here is the code:
from pyculib.fft.binding import Plan, CUFFT_C2C
import numpy as np
from numba import cuda
data = np.random.rand(26, 256, 256).astype(np.complex64)
orig = data.copy()
d_data = cuda.to_device(data)
fftplan = Plan.three(CUFFT_C2C, *data.shape)
fftplan.forward(d_data, d_data)
fftplan.inverse(d_data, d_data)
d_data.copy_to_host(data)
result = data / n
np.allclose(orig, result.real)
Finally, it turns out to be False. And the difference between orig and result is not a small number, not negligible.
I tried some other data sets (not random numbers), and get the some wrong results.
Also, I test without inverse FFT:
from pyculib.fft.binding import Plan, CUFFT_C2C
import numpy as np
from numba import cuda
from scipy.fftpack import fftn,ifftn
data = np.random.rand(26,256,256).astype(np.complex64)
orig = data.copy()
orig_fft = fftn(orig)
d_data = cuda.to_device(data)
fftplan = Plan.three(CUFFT_C2C, *data.shape)
fftplan.forward(d_data, d_data)
d_data.copy_to_host(data)
np.allclose(orig_fft, data)
The result is also wrong.
The test code on website, they use numpy.arange to create the matrix. And I tried:
n = 26*256*256
data = np.arange(n, dtype=np.complex64).reshape(26,256,256)
And the FFT result of this matrix is right.
Could anyone help to point out why?

I don't use CUDA, but I think your problem is numerical in nature. The difference lies in the two data sets you are using. random.rand has dynamic range of 0-1, and arange 0-26*256*256. The FFT attempts to resolve spatial frequency components on the order of range of values / number of points. For arange, this becomes unity, and the FFT is numerically accurate. For rand, this is 1/26*256*256 ~ 5.8e-7.
Just running FFT/IFFT on your numpy arrays without using CUDA shows similar differences.

Re-compose a Tensor after tensor factorization

I am trying to decompose a 3D matrix using python library scikit-tensor. I managed to decompose my Tensor (with dimensions 100x50x5) into three matrices. My question is how can I compose the initial matrix again using the decomposed matrix produced with Tensor factorization? I want to check if the decomposition has any meaning. My code is the following:
import logging
from scipy.io.matlab import loadmat
from sktensor import dtensor, cp_als
import numpy as np
//Set logging to DEBUG to see CP-ALS information
logging.basicConfig(level=logging.DEBUG)
T = np.ones((400, 50))
T = dtensor(T)
P, fit, itr, exectimes = cp_als(T, 10, init='random')
// how can I re-compose the Matrix T? TA = np.dot(P.U[0], P.U[1].T)
I am using the canonical decomposition as provided from the scikit-tensor library function cp_als. Also what is the expected dimensionality of the decomposed matrices?

The CP product of, for example, 4 matrices
can be expressed using Einstein notation as
or in numpy as
numpy.einsum('az,bz,cz,dz -> abcd', A, B, C, D)
so in your case you would use
numpy.einsum('az,bz->ab', P.U[0], P.U[1])
or, in your 3-matrix case
numpy.einsum('az,bz,cz->abc', P.U[0], P.U[1], P.U[2])
sktensor.ktensor.ktensor also have a method totensor() that does exactly this:
np.allclose(np.einsum('az,bz->ab', P.U[0], P.U[1]), P.totensor())
>>> True

See an explanation of CP here. You may also use tensorlearn package to rebuild the tensor.

Python Numpy - Complex Numbers - Is there a function for Polar to Rectangular conversion?

Is there a built-in Numpy function to convert a complex number in polar form, a magnitude and an angle (degrees) to one in real and imaginary components?
Clearly I could write my own but it seems like the type of thing for which there is an optimised version included in some module?
More specifically, I have an array of magnitudes and an array of angles:
>>> a
array([1, 1, 1, 1, 1])
>>> b
array([120, 121, 120, 120, 121])
And what I would like is:
>>> c
[(-0.5+0.8660254038j),(-0.515038074+0.8571673007j),(-0.5+0.8660254038j),(-0.5+0.8660254038j),(-0.515038074+0.8571673007j)]

There isn't a function to do exactly what you want, but there is angle, which does the hardest part. So, for example, one could define two functions:
def P2R(radii, angles):
return radii * exp(1j*angles)
def R2P(x):
return abs(x), angle(x)
These functions are using radians for input and output, and for degrees, one would need to do the conversion to radians in both functions.
In the numpy reference there's a section on handling complex numbers, and this is where the function you're looking for would be listed (so since they're not there, I don't think they exist within numpy).

There's an error in the previous answer that uses numpy.vectorize - cmath.rect is not a module that can be imported. Numpy also provides the deg2rad function that provides a cleaner piece of code for the angle conversion. Another version of that code could be:
import numpy as np
from cmath import rect
nprect = np.vectorize(rect)
c = nprect(a, np.deg2rad(b))
The code uses numpy's vectorize function to return a numpy style version of the standard library's cmath.rect function that can be applied element wise across numpy arrays.

I used cmath with itertools:
from cmath import rect,pi
from itertools import imap
b = b*pi/180 # convert from deg to rad
c = [x for x in imap(rect,a,b)]

import numpy as np
import cmath.rect
nprect = np.vectorize(rect)
c = nprect(a,b*np.pi/180)

tom10 answer works fine... you can also expand the Euler's formula to:
def P2R(A, phi):
return A * ( np.cos(phi) + np.sin(phi)*1j )

"Converting" Numpy arrays to Matlab and vice versa

I am looking for a way to pass NumPy arrays to Matlab.
I've managed to do this by storing the array into an image using scipy.misc.imsave and then loading it using imread, but this of course causes the matrix to contain values between 0 and 256 instead of the 'real' values.
Taking the product of this matrix divided by 256, and the maximum value in the original NumPy array gives me the correct matrix, but I feel that this is a bit tedious.
is there a simpler way?

Sure, just use scipy.io.savemat
As an example:
import numpy as np
import scipy.io
x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)
scipy.io.savemat('test.mat', dict(x=x, y=y))
Similarly, there's scipy.io.loadmat.
You then load this in matlab with load test.
Alteratively, as #JAB suggested, you could just save things to an ascii tab delimited file (e.g. numpy.savetxt). However, you'll be limited to 2 dimensions if you go this route. On the other hand, ascii is the universial exchange format. Pretty much anything will handle a delimited text file.

A simple solution, without passing data by file or external libs.
Numpy has a method to transform ndarrays to list and matlab data types can be defined from lists. So, when can transform like:
np_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
mat_a = matlab.double(np_a.tolist())
From matlab to python requires more attention. There is no built-in function to convert the type directly to lists. But we can access the raw data, which isn't shaped, but plain. So, we use reshape (to format correctly) and transpose (because of the different way MATLAB and numpy store data). That's really important to stress: Test it in your project, mainly if you are using matrices with more than 2 dimensions. It works for MATLAB 2015a and 2 dims.
np_a = np.array(mat_a._data.tolist())
np_a = np_a.reshape(mat_a.size).transpose()

Here's a solution that avoids iterating in python, or using file IO - at the expense of relying on (ugly) matlab internals:
import matlab
# This is actually `matlab._internal`, but matlab/__init__.py
# mangles the path making it appear as `_internal`.
# Importing it under a different name would be a bad idea.
from _internal.mlarray_utils import _get_strides, _get_mlsize
def _wrapper__init__(self, arr):
assert arr.dtype == type(self)._numpy_type
self._python_type = type(arr.dtype.type().item())
self._is_complex = np.issubdtype(arr.dtype, np.complexfloating)
self._size = _get_mlsize(arr.shape)
self._strides = _get_strides(self._size)[:-1]
self._start = 0
if self._is_complex:
self._real = arr.real.ravel(order='F')
self._imag = arr.imag.ravel(order='F')
else:
self._data = arr.ravel(order='F')
_wrappers = {}
def _define_wrapper(matlab_type, numpy_type):
t = type(matlab_type.__name__, (matlab_type,), dict(
__init__=_wrapper__init__,
_numpy_type=numpy_type
))
# this tricks matlab into accepting our new type
t.__module__ = matlab_type.__module__
_wrappers[numpy_type] = t
_define_wrapper(matlab.double, np.double)
_define_wrapper(matlab.single, np.single)
_define_wrapper(matlab.uint8, np.uint8)
_define_wrapper(matlab.int8, np.int8)
_define_wrapper(matlab.uint16, np.uint16)
_define_wrapper(matlab.int16, np.int16)
_define_wrapper(matlab.uint32, np.uint32)
_define_wrapper(matlab.int32, np.int32)
_define_wrapper(matlab.uint64, np.uint64)
_define_wrapper(matlab.int64, np.int64)
_define_wrapper(matlab.logical, np.bool_)
def as_matlab(arr):
try:
cls = _wrappers[arr.dtype.type]
except KeyError:
raise TypeError("Unsupported data type")
return cls(arr)
The observations necessary to get here were:
Matlab seems to only look at type(x).__name__ and type(x).__module__ to determine if it understands the type
It seems that any indexable object can be placed in the ._data attribute
Unfortunately, matlab is not using the _data attribute efficiently internally, and is iterating over it one item at a time rather than using the python memoryview protocol :(. So the speed gain is marginal with this approach.

scipy.io.savemat or scipy.io.loadmat does NOT work for matlab arrays --v7.3. But the good part is that matlab --v7.3 files are hdf5 datasets. So they can be read using a number of tools, including numpy.
For python, you will need the h5py extension, which requires HDF5 on your system.
import numpy as np, h5py
f = h5py.File('somefile.mat','r')
data = f.get('data/variable1')
data = np.array(data) # For converting to numpy array

Some time ago I faced the same problem and wrote the following scripts to allow easy copy and pasting of arrays back and forth from interactive sessions. Obviously only practical for small arrays, but I found it more convenient than saving/loading through a file every time:
Matlab -> Python
Python -> Matlab

Not sure if it counts as "simpler" but I found a solution to move data from a numpy arrray created in a python script which is called by matlab quite fast:
dump_reader.py (python source):
import numpy
def matlab_test2():
np_a = numpy.random.uniform(low = 0.0, high = 30000.0, size = (1000,1000))
return np_a
dump_read.m (matlab script):
clear classes
mod = py.importlib.import_module('dump_reader');
py.importlib.reload(mod);
if count(py.sys.path,'') == 0
insert(py.sys.path,int32(0),'');
end
tic
A = py.dump_reader.matlab_test2();
toc
shape = cellfun(#int64,cell(A.shape));
ls = py.array.array('d',A.flatten('F').tolist());
p = double(ls);
toc
C = reshape(p,shape);
toc
It relies on the fact that matlabs double seems be working efficiently on arrays compared to cells/matrices. Second trick is to pass the data to matlabs double in an efficient way (via pythons native array.array).
P.S. sorry for necroposting but I struggled a lot with its and this topic was one of the closest hits. Maybe it helps someone to shorten the time of struggling.
P.P.S. tested with Matlab R2016b + python 3.5.4 (64bit)

The python library Darr allows you to save your Python numpy arrays in a self-documenting and widely readable format, consisting of just binary and text files. When saving your array, it will include code to read that array in a variety of languages, including Matlab. So in essence, it is just one line to save your numpy array to disk in Python, and then copy-paste the code from the README.txt to load it into Matlab.
Disclosure: I wrote the library.

From MATLAB R2022a on, matlab.double (and matlab.int8, matlab.uint8, etc.) objects implement the buffer protocol. This means that you can pass them into NumPy array constructors. Construction in the opposite direction (which is the subject of the question here) is supported as well. That is, matlab objects can be constructed from objects that implement the buffer protocol. Thus, for instance, a matlab.double can be constructed from a NumPy double array.
UPDATE: Furthermore, from MATLAB R2022b on, objects that implement the buffer protocol (such as NumPy objects) can be passed directly into MATLAB functions that are called via Python. From the MATLAB Release Notes for R2022b, under the "External Language Interfaces" section:
import matlab.engine
import numpy
eng = matlab.engine.start_matlab()
buf = numpy.array([[1, 2, 3], [4, 5, 6]], dtype='uint16')
# Supported in R2022a and earlier: must initialize a matlab.uint16 from
# the numpy array and pass it to the function
array_as_matlab_uint16 = matlab.uint16(buf)
res = eng.sum(array_as_matlab_uint16, 1, 'native')
print(res)
# Supported as of R2022b: can pass the numpy array
# directly to the function
res = eng.sum(buf, 1, 'native')
print(res)

Let use say you have a 2D daily data with shape (365,10) for five years saved in np array np3Darrat that will have a shape (5,365,10). In python save your np array:
import scipy.io as sio #SciPy module to load and save mat-files
m['np3Darray']=np3Darray #shape(5,365,10)
sio.savemat('file.mat',m) #Save np 3D array
Then in MATLAB convert np 3D array to MATLAB 3D matix:
load('file.mat','np3Darray')
M3D=permute(np3Darray, [2 3 1]); %Convert numpy array with shape (5,365,10) to MATLAB matrix with shape (365,10,5)

In latest R2021a, you can pass a python numpy ndarray to double() and it will convert to a native matlab matrix, even when calling in console the numpy array it will suggest at the bottom "Use double function to convert to a MATLAB array"

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

The equivalence of Matlab sprand() in Python? - python

Related

Convert 3D numpy Array in Python to 3D Matrix in Matlab [duplicate]

Result of 3D FFT using pyculib is wrong

Re-compose a Tensor after tensor factorization

Python Numpy - Complex Numbers - Is there a function for Polar to Rectangular conversion?

"Converting" Numpy arrays to Matlab and vice versa

Categories

Resources