I have a problem in Python where I would like to merge several sparse matrices into one. The matrices are of csr_matrix type and have the same number of rows. When I use hstack to stack them together I obtain an array of matrices, but I would like to obtain a single matrix that has the same number of rows as each input and whose number of columns is the sum of the column counts of all the matrices.
Thanks for your help.
You can do this using scipy.sparse.hstack. For example:
import numpy as np
from scipy import sparse
x = sparse.csr_matrix(np.random.randint(0, 2, size=(10, 10)))
y = sparse.csr_matrix(np.random.randint(0, 2, size=(10, 10)))
xy = sparse.hstack([x, y])
print(xy.shape)
# (10, 20)
print(type(xy))
# <class 'scipy.sparse.coo.coo_matrix'>
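A short sketch of two details the example above leaves implicit. An "array of matrices" is what one would typically get from numpy.hstack, which does not understand sparse inputs, so the likely fix is to call scipy.sparse.hstack instead. Its result is in COO format by default, and the format argument can request CSR directly. The seeded generator and the differing column counts (10 and 15) are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import sparse

# Seeded generator so the example is reproducible (illustrative choice).
rng = np.random.default_rng(0)

# Two CSR matrices with the same number of rows but different column counts.
x = sparse.csr_matrix(rng.integers(0, 2, size=(10, 10)))
y = sparse.csr_matrix(rng.integers(0, 2, size=(10, 15)))

# format="csr" asks hstack to return CSR instead of the default COO.
xy = sparse.hstack([x, y], format="csr")

print(xy.shape)   # (10, 25)
print(xy.format)  # csr
```

If the COO result is acceptable, the format argument can be dropped and the matrix converted later with xy.tocsr().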
I want to perform an elementwise multiplication of two (scipy) sparse matrices: A.shape = B.shape = (m,n). However, matrix B consists of a smaller matrix B_base that is stacked horizontally several times. Building B explicitly is not memory-efficient. Thus, the question: how can I efficiently multiply A and B_base elementwise without stacking?
Below is an MWE using sparse.hstack:
from scipy import sparse
A = sparse.random(m=1000, n=10000, density=0.1, format="csc")
B = sparse.random(m=1000, n=1000, density=0.1, format="csc")
factor_matrix = sparse.hstack([B for i in range(10)], format="csc")
result = A.multiply(factor_matrix)
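One possible approach, sketched under assumptions rather than offered as a definitive answer: slice A into column blocks the width of B_base, multiply each block by B_base, and stack only the products. This avoids materialising the repeated factor matrix, although the result itself is still assembled with hstack. The names B_base, n_base and n_blocks, the smaller sizes, and the random seeds are illustrative, and the sketch assumes the column count of A is an exact multiple of that of B_base.

```python
from scipy import sparse

# Smaller sizes than in the MWE so the sketch runs quickly (illustrative).
A = sparse.random(100, 1000, density=0.1, format="csc", random_state=0)
B_base = sparse.random(100, 100, density=0.1, format="csc", random_state=1)

n_base = B_base.shape[1]
n_blocks = A.shape[1] // n_base  # assumes A's width is a multiple of n_base

# Multiply each column block of A by B_base; CSC makes column slicing cheap.
blocks = [
    A[:, i * n_base:(i + 1) * n_base].multiply(B_base)
    for i in range(n_blocks)
]

# Only the (sparser) products are stacked, never the repeated B_base.
result = sparse.hstack(blocks, format="csc")
```

Whether this is faster than the stacked version depends on the number of blocks and the density, so it is worth timing on the real data.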
I have two sparse arrays created with Python's sparse package. See below:
import sparse
total_coords1 = [(0,1,1,2), (0,0,2,3), (0,1,2,2)]
data1 = [1,1,1,1]
s1 = sparse.COO(total_coords1, data1, shape=(7, 5, 12))
total_coords2 = [(0,1,2,3), (0,1,1,2), (0,1,2,2)]
data2 = [2,2,2,2]
s2 = sparse.COO(total_coords2, data2, shape=(7, 5, 15))
I want to combine these two sparse matrices into a single sparse matrix along the last axis (axis=2). Something like:
s3 = sparse.COO(s1, s2)
Since you did not mention the axis along which you want to concatenate, I will assume axis=2, as it is the only possible axis along which we can concatenate the given arrays.
You can use the concatenate function to get a single sparse array of shape (7, 5, 27):
s3 = sparse.concatenate([s1,s2], axis=2)
I have an array of 2x2 complex matrices that represents a transformation of a scattering matrix over time. For my calculations I need a way to multiply such arrays between themselves (matrix multiplication); multiply each matrix in the array by another matrix; apply a transformation to all matrices in the array.
I've tried multiple ways of doing so with numpy (4 column array, list of arrays, array of matrices), but each of them, while providing a nice interface for some of the required functions, makes the rest very awkward.
So here's the question - what is the best way to represent such structures and how would I carry out the required transformations over them?
examples
Initially data is in csv file:
import numpy as np
csv = np.arange(45.).reshape(5,9)
t = np.array(csv[:,0]) # time array
4 column array
transform csv to 4 column array:
data = np.apply_along_axis(lambda x: [x[1]+1j*x[2],
x[3]+1j*x[4],
x[5]+1j*x[6],
x[7]+1j*x[8]],1,csv)
array x matrix:
m = np.array([[1,0],[0,0]])
np.apply_along_axis(lambda x: (x.reshape(2,2).dot(m)).reshape(1,4),1,data)
array x array:
would probably require a for loop and array preallocation
transformation:
np.apply_along_axis(lambda x: [-(x[0]*x[3]-x[1]*x[2])/x[2],
x[0]/x[2],
-x[3]/x[2],
1/x[2]],1,data)
list of arrays
transform csv to list of arrays:
data = [np.array([[i[1]+1j*i[2],
i[3]+1j*i[4]],
[i[5]+1j*i[6],
i[7]+1j*i[8]]]) for i in csv]
array x matrix:
m = np.array([[1,0],[0,0]])
[i.dot(m) for i in data]
array x array:
[data[i].dot(data[i]) for i in range(len(data))]
transformation:
[np.array([[-(np.linalg.det(x))/x[1,0],
x[0,0]/x[1,0]],
[-x[1,1]/x[1,0],
1/x[1,0]]]) for x in data]
array of matrices
transform csv to array of matrices:
data = np.apply_along_axis(lambda x: [[x[1]+1j*x[2],
x[3]+1j*x[4]],
[x[5]+1j*x[6],
x[7]+1j*x[8]]],1,csv)
array x matrix:
m = np.array([[1,0],[0,0]])
data.dot(m)
array x array:
would probably require a for loop and array preallocation
data * data # not a dot product
transformation:
would probably require a for loop and array preallocation
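One representation that may cover all three operations without loops is a single array of shape (N, 2, 2), relying on the fact that the @ operator (np.matmul) treats the leading axis as a batch of matrices. The following is a minimal sketch under that assumption. The transformation follows the formula of the 4 column version above, dividing by the element in row 1, column 0. The names times_m, squared and transformed are illustrative.

```python
import numpy as np

csv = np.arange(45.).reshape(5, 9)
t = csv[:, 0]  # time array

# Columns 1,3,5,7 hold real parts and 2,4,6,8 imaginary parts;
# reshape the four complex values of each row into a 2x2 matrix.
data = (csv[:, 1::2] + 1j * csv[:, 2::2]).reshape(-1, 2, 2)

m = np.array([[1, 0], [0, 0]])

# array x matrix: m is broadcast against every 2x2 matrix in the stack.
times_m = data @ m

# array x array: matrix i of the left stack times matrix i of the right stack.
squared = data @ data

# transformation, written on the individual elements of every matrix at once.
s11, s12 = data[:, 0, 0], data[:, 0, 1]
s21, s22 = data[:, 1, 0], data[:, 1, 1]
det = s11 * s22 - s12 * s21
transformed = np.stack(
    [-det / s21, s11 / s21, -s22 / s21, 1 / s21], axis=-1
).reshape(-1, 2, 2)
```

np.einsum("nij,njk->nik", a, b) is an equivalent spelling of the batched product if explicit index notation is preferred.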
I am using Python with the numpy, scipy and scikit-learn modules.
I'd like to classify the arrays in a very big sparse matrix (100,000 x 100,000).
The values in the matrix are equal to 0 or 1. The only thing I have is the indices of the entries whose value is 1.
a = [1,3,5,7,9]
b = [2,4,6,8,10]
which means
a = [0,1,0,1,0,1,0,1,0,1,0]
b = [0,0,1,0,1,0,1,0,1,0,1]
How can I convert the index arrays to a sparse array in scipy?
How can I classify those arrays quickly?
Thank you very much.
If you choose the sparse coo_matrix, you can create it by passing the indices directly:
import numpy as np
from scipy.sparse import coo_matrix
nrows = 100000
ncols = 100000
# NumPy is used directly; the scipy.array and scipy.ones aliases are deprecated
row = np.array([1,3,5,7,9])
col = np.array([2,4,6,8,10])
values = np.ones(col.size)
m = coo_matrix((values, (row,col)), shape=(nrows, ncols), dtype=float)
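If each index list is instead meant to describe one row of the matrix, as the expanded a and b in the question suggest, a CSR matrix can be built from the lists directly. This is a minimal sketch under that reading of the question. The names index_lists and n_cols are illustrative, and n_cols is set to 11 only to match the small example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Each inner list holds the column indices of the ones in one row.
index_lists = [[1, 3, 5, 7, 9], [2, 4, 6, 8, 10]]
n_cols = 11  # 100000 in the real problem

# indptr[i]:indptr[i+1] marks the slice of `indices` that belongs to row i.
indptr = np.cumsum([0] + [len(idx) for idx in index_lists])
indices = np.concatenate(index_lists)
values = np.ones(len(indices), dtype=np.int8)

m = csr_matrix((values, indices, indptr), shape=(len(index_lists), n_cols))
print(m.toarray())
```

scikit-learn estimators generally accept CSR input, so a matrix built this way can usually be passed to them without converting it to a dense array.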
I'm looking for dynamically growing vectors in Python, since I don't know their length in advance. In addition, I would like to calculate distances between these sparse vectors, preferably using the distance functions in scipy.spatial.distance (although any other suggestions are welcome). Any ideas how to do this? (Initially, it doesn't need to be efficient.)
Thanks a lot in advance!
You can use regular python lists (which are dynamic) as vectors. Trivial example follows.
from scipy.spatial.distance import sqeuclidean
a = [1,2,3]
b = [0,0,0]
print(sqeuclidean(a, b)) # 14.0
As per aganders3's suggestion, do note that you can also use numpy arrays if needed:
import numpy
a = numpy.array([1,2,3])
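If the sparse part of your question is crucial, I'd use scipy for that, since it has support for sparse matrices. You can define a 1xn matrix and use it as a vector (the parameter is the shape of the matrix, which is filled with zeros by default). The functions in scipy.spatial.distance expect dense 1-D arrays, so convert the sparse vectors before calling them:
import scipy.sparse
u = scipy.sparse.coo_matrix((1, 3))
v = scipy.sparse.coo_matrix((1, 3))
print(sqeuclidean(u.toarray().ravel(), v.toarray().ravel())) # 0.0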
There are many kinds of sparse matrices, some of them dictionary based (see comment). You can define a sparse row matrix from a list like this:
scipy.sparse.csr_matrix([1,2,3])
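When the vectors are too long to convert to dense arrays, the squared Euclidean distance can also be computed with sparse operations only. The following is a minimal sketch, assuming both vectors are stored as 1xn CSR matrices of the same width. The example values are illustrative, and the dense call is included only to cross-check the result.

```python
from scipy import sparse
from scipy.spatial.distance import sqeuclidean

# Two sparse row vectors of the same length (illustrative values).
u = sparse.csr_matrix([[1, 0, 0, 2, 0]])
v = sparse.csr_matrix([[0, 0, 3, 2, 0]])

# The difference stays sparse; multiply() squares it elementwise.
diff = u - v
sq_dist_sparse = diff.multiply(diff).sum()

# Cross-check against scipy.spatial.distance on the dense 1-D versions.
sq_dist_dense = sqeuclidean(u.toarray().ravel(), v.toarray().ravel())

print(sq_dist_sparse, sq_dist_dense)
```

For vectors that grow over time, one option is to collect (index, value) pairs in a plain dict and build the CSR rows only once the final length is known.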
Here is how you can do it in numpy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([0, 0, 0])
c = np.sum(((a - b) ** 2)) # 14